Intel® 82576EB Gigabit Ethernet Controller Datasheet
LAN Access Division (LAD)
Revision: 2.63
December 2011

Product Features

External Interfaces
• PCIe* v2.0 (2.5 GT/s) x4/x2/x1; called PCIe in this document
• MDI (copper) standard IEEE 802.3 Ethernet interface for 1000BASE-T, 100BASE-TX, and 10BASE-T applications (802.3, 802.3u, and 802.3ab)
• Serializer-deserializer (SerDes) to support 1000BASE-SX/X/LX (optical fiber) for gigabit backplane applications
• SGMII for SFP/external PHY connections
• NC-SI (type C) or SMBus for manageability connection to BMC
• IEEE 1149.1 JTAG

Intel® I/O Acceleration Technology
• Stateless offloads (header split, RSS)
• Intel® QuickData (DCA - Direct Cache Access)

Virtualization Ready
• Next generation VMDq support (8 VMs)
• PCI-SIG single root I/O virtualization (direct assignment)
• Queues per port: 16 Tx queues and 16 Rx queues

Full-Spectrum Security
• IPsec (256 SAs) in 82576EB; IPsec not present in 82576NS [non-security]
• MACsec

Additional Product Details
• 25 mm x 25 mm package
• Power 2.8 W (max)
• Support for PCI 3.0 Vital Product Data
• Memories parity or ECC protection
• IPMI MC pass-thru; multi-drop NC-SI
• 802.1AS draft standard implementation
• Layout compatible with 82575

Legal

Information in this document is provided in connection with Intel® products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel's terms and conditions of sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, life sustaining, critical control or safety systems, or in nuclear facility applications.

Intel may make changes to specifications and product descriptions at any time, without notice.

Intel Corporation may have patents or pending patent applications, trademarks, copyrights, or other intellectual property rights that relate to the presented subject matter. The furnishing of documents and other materials and information does not provide any license, express or implied, by estoppel or otherwise, to any such patents, trademarks, copyrights, or other intellectual property rights.

Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.

Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Copies of documents which have an order number and are referenced in this document, or other Intel literature may be obtained by calling 1-800-548-4725 or by visiting Intel's website at http://www.intel.com.

Intel and Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

*Other names and brands may be claimed as the property of others.

Copyright © 2007, 2008, 2009, 2010, 2011; Intel Corporation. All rights reserved.

Revisions

Revision   Date        Comments
0.5        6/2007      Initial availability.
1.0        11/2007     Updates and corrections.
1.9        5/2008      PRQ release.
2.0        6/2008      SRA release.
2.1        7/2008      Maintenance update. Added checklist chapter.
2.2        11/2008     Maintenance update.
  • Corrected device ID reference to 0x10c9.
  • Section 3.3.1.7; Section 12.3.2.2.1 - EEPROM-less information updated; stronger statements about EEPROM-less design.
  • Table 3-17 - Device ID corrected.
  • GIO_PWR_GOOD updated to PERST# throughout.
  • Section 6.1 - More PXE information documented. Entire section updated. See PXE listings on EEPROM map. Also, links added for entire EEPROM reference map.
  • Section 7.10.3.5.1, Section 7.10.3.5.2 - Notes added after VFRE filtering paragraphs in numbered list.
  • Section 8.8.7, Section 8.8.8, Section 8.8.9, Section 8.8.10 - The ICR, ICS, IMS, IMC registers were corrected. See bit 3 in each.
  • Chapter 10.0, System Manageability updated; organization changed; some additional information provided.
  • Section 10.6.2.12 - Bit description in table updated (to 0x21).
  • Table 10-10 - IPv4 and IPv6 filter parameter information corrected.
  • Table 10-33 - List of supported commands has been updated.
  • Table 11.4.2.1 - Current consumption data updated. See bold text in table. Also, see power data in summary on title page.
  • Table 12-2 - Additional magnetics recommendation added.
2.3        12/2008
  • Section 6.2.18 - Bit 15 information updated; enable WAKE# assertion.
2.4        4/1/2009
  • Jumbo frame size consistently indicated at 9500 bytes (max).
  • SKU 82576NS documented. The IPsec function is present in the 82576EB SKU. IPsec is not present in the 82576NS SKU. This is indicated throughout the document.
  • Section 3.3.4.2, Flash Write Control - Typing correction. Note that writes to the flash device should not be attempted when writes are disabled (EEC.FWE = 01b).
  • Section 3.4.2, Software Watchdog - Updated. Edited to describe the software interrupt (ICR[26]) and to reduce confusion.
  • Section 3.5.6.5.1, Setting the 82576 to External PHY Loopback Mode - Text added at the end of the section for clarity: The above procedure puts the device in PHY loopback mode. After using the procedure, wait for link to become up. Once PHY register 1 bit 2 is set (this can take up to 750 ms), transmit and receive normally. If you are unable to get link after 750 ms, reset the PHY using CTRL.PHY_RST and then repeat the above procedure. When exiting external PHY loopback mode, a full PHY reset must be done. Use CTRL.PHY_RST.
  • Section 4.4, Device Disable - The following phrase in the section has been changed: The EEPROM "Power Down Enable" bit (Section 6.2.7) enables device disable mode (hardware default is that the mode is disabled).
  • Table 4-5, 82576 Reset Effects - Per Function Resets - Table updated. See the entries on PCI configuration registers and the associated footnotes.
  • Section 4.2.1.6.3, VF Software Reset - Replaced VFCTRL with VTCTRL (corrects a typo). Added information that indicates what happens when VTCTRL.RST is set. Setting VTCTRL.RST resets interrupts and queue enable bits. Other VF registers are not reset.
  • Section 5.0, Power Management updated for clarity.
  • Section 6.10.7.1, iSCSI Module Structure - Description of structure updated. Multiple errors were corrected.
  • Section 7.1.3.1, Host Buffers - Text added. For advanced descriptor usage, the SRRCTL.BSIZEHEADER field is used to define the size of the buffers allocated to headers. The maximum buffer size supported is 960 bytes.
  • Section 8.2.4, MDI Control Register - MDIC (0x00020; R/W) - Description of bit 31 corrected.
  • Section 8.10.2, Split and Replication Receive Control - SRRCTL (0x0c00c + 0x40*n [n=0...15]; R/W). Maximum 960 bytes now indicated for SRRCTL.BSIZEHEADER.
  • Section 10.4.4.3, RMCP Filtering - Title of section updated.
  • Section 10.5.10.1.4, Force TCO Command and Section 10.6.2.13.1, Perform Intel TCO Reset Command (Intel Command 0x22) - Added description of RESET_MGMT bit.
  • Section 10.5.12, Example Configuration Steps - Added pseudocode describing the setup of common filtering configurations.
  • Table 10-35, Command Summary - Commands added, see:
      0x02  0x67/68  Set EtherType Filter/Packet Add. Ext. Filter
      0x03  0x67/68  Get EtherType Filter/Packet Add. Ext. Filter
  • Section 10.5.10.2.1, Receive TCO LAN Packet Transaction. Description of packet structure added.
  • Section 10.6.2.6.19, Set Intel Filters - Packet Addition Extended Decision Filter Command (Intel Command 0x02, Filter Parameter 0x68). Text in section updated: extended decision filter index range adjusted to 0..4.
  • Table 11-5, Current Consumption Details - Added SGMII note to table: (3) To estimate power for SGMII mode, use the SerDes mode power numbers provided.
  • Table 11-22, Package Height - Table added. Provides a summary of package height information.
2.41       4/8/2009 5/5/2009
  • Section 7.1.4, Legacy Receive Descriptor Format and Section 7.2.2, Transmit Descriptors. Recommendation regarding legacy descriptors changed to "must not be used" from "should not be used."
2.42       7/5/2009    Internal release for test and review.
2.43       10/2/2009   MACsec capability exposed. You must have a MACsec-ready switch in order to complete the ecosystem and make use of MACsec functionality. Maintenance issues addressed:
  • Section 7.2.4.7.2, TCP/IP/UDP Headers for the Subsequent Frames and Section 7.2.4.7.3, TCP/IP/UDP Headers for the Last Frame updated to document UDP fields.
  • Section 7.3.3.2, Interrupt Moderation and Section 8.8.12, Interrupt Throttle - EITR (0x01680 + 4*n [n = 0...24]; R/W) updated to correct minor issues; redundant data removed.
  • Table 7-9, VLAN Tag Field Layout (for 802.1q Packet) - Note added to table that clarifies usage:
      Note: This table is relevant only if VMVIR.VLANA = 00b (use descriptor command) for the queue.
  • Section 7.10.3.2.1, Filtering Capabilities - Typo corrected. In bullet, VM changed to VF. Below:
      Promiscuous multicast & enable broadcast per VF.
  • Section 7.10.3.8, Offloads - Note added; text below:
      Note: VLAN strip offload is determined based only on the L2 MAC address. In order to make sure VLAN strip offload is correctly applied, all packets should be initially forwarded using one of the L2 MAC address filters (RAH/RAL, UTA, MTA, VMOLR.BAM, VMOLR.MPE).
  • Two table titles corrected. Could have caused confusion. Minor edits also made to field descriptions.
      Table 7-35, TCP/IP or UDP/IP Packet Format Sent by Host
      Table 7-36, TCP/IP or UDP/IP Packet Format Sent by 82576
  • Section 8.10.7, Receive Descriptor Ring Length - RDLEN (0x0c008 + 0x40*n [n=0...15]; R/W) - Description updated. LEN text added: The maximum allowed value is 0x80000 (32K descriptors).
  • Section 8.12.2, Transmit Control Extended - TCTL_EXT (0x0404; R/W) - Default value of COLD corrected (0x42) in text description.
  • Section 10.5.10.1.4, Force TCO Command - Clarification note added to table. See below:
      Note: Before initiating a firmware reset command, one should disable TCO receive via the Receive Enable command (setting RCV_EN to 0) and wait for 200 milliseconds before initiating the firmware reset command. In addition, the MC should not transmit during this period.
  • Section 10.5.10.2.1, Receive TCO LAN Packet Transaction - Receive TCO packet format table updated; numerous changes for clarity.
  • Section 10.7.10, Read Fail-Over Configuration Host Command - Both tables in section updated.
      Table 10-49, Commands to Read the Fail-Over Configuration Register - Last row in table deleted; was incorrect.
      Table 10-50, States Returned - Description column (byte 1) updated. Description was confusing.
  • Section 10.5.12.3.1, Example 3 - Pseudo Code - Step 5: MAC address filtering is bit 0, not bit 1. Also the MDEF value is 00000009 and not 00000040.
  • Section 10.5.12.4.1, Example 4 - Pseudo Code - Step 5: Configure MDEF[0], MDEF value is 0000004 and not 00000040.
2.44       10/14/2009
  • Section 9.6.4.3, PCIe SR-IOV Control Register (0x168; RW); bit 4; ARI capable hierarchy. Text updated.
  • Section 10.0, System Manageability; more information on MACsec parameters provided. See Section 10.5.10.1.6, Update MACsec Parameters and Section 10.8, MACsec and Manageability in particular.
  • Section 10.5.10.1.3, Receive Enable Command; Section 10.5.10.2.5, Read Management Receive Filter Parameters. Bit order expression corrected in two tables. See bold text.
  • References to BMC changed to MC if the reference is not programmatic.
2.45       10/30/2009
  • Section 3.3.1.6, EEPROM Recovery. Section now exposed in the datasheet.
  • Section 8.10.8, Receive Descriptor Head - RDH (0x0c010 + 0x40*n [n=0...15]; RO) and Section 8.12.11, Transmit Descriptor Head - TDH (0x0e010 + 0x40*n [n=0...15]; RO). Both registers were incorrectly indicated as RW. Changed to RO.
  • Table 10-33, Supported NC-SI Commands and Table 10-34, Optional NC-SI Features Support. List of supported commands/functions updated to correct an error in our support statements. See bold text in both tables.
2.46       12/1/2009
  • Table 7-18, Table 7-39, Table 7-41. "Packet is greater than 1552 bytes; (LPE=1b)." updated to "Packet is greater than 1518/1522/1526 bytes; (LPE=1b)."
  • Chapter 8.0, Receive Control Register - RCTL (0x00100; R/W). Description of LPE field updated.
  • Chapter 10.0, System Manageability. Changes and clarifications to list of NC-SI commands. Added the Get EtherType and Get Intel Filters - Packet Addition Extended Decision Filter commands. Added the Set/Get Unicast/Broadcast/Multicast Packet Reduction filters. Added a recommendation to use the Packet Addition Extended Decision Filter commands (0x68) instead of the Packet Addition Decision Filter commands (0x61).
2.47       3/10/2010
  • Chapter 5.0, Power Management. In tables where these fields occur, the following fields have been flipped to reflect this order. They were previously reversed in the tables.
      Possible VLAN tag
      Possible LLC/SNAP header
  • Chapter 5.0, Power Management. Table 5-5 through Table 5-10; offset and byte information has been updated.
  • Section 6.10.6.1, Main Setup Options PCI Function 0 (Word 0x30). Description of bit 5 updated to "IBD: iSCSI Boot Disable."
  • Section 6.10.6.7, iSCSI Option ROM Version (Word 0x36). Description of word 0x36 added. Describes option ROM versions.
  • Section 6.2.18, PCIe Control (Word 0x1b). Description of bit 12 updated to "Lane Reversal Disable".
  • Section 7.10.3.6.2, Replication Mode Disabled - The following list item was deleted: "3. Multicast or broadcast - If the packet is a multicast or broadcast packet and was not forwarded in step 1 and 2, set the default pool bit in the pool list (from VT_CTL.DEF_PL)."
  • Section 7.10.3.4, Size Filtering. This section added.
  • Section 10.5.10.1.6, Update MACsec Parameters. Table rows in the section updated. See:
      Initialize MACsec Rx
      Initialize MACsec Tx
      Set MACsec Tx Key
      Enable MACsec
  • Section 11.4.2.2, Digital I/O. Table notes have been corrected in the table that resides in the section. Two notes weren't referenced in the table correctly.
  • Appendix A. Changes from the 82575. Appendix added (to datasheet).
2.48       6/14/2011
  • NC-SI identified as type C.
  • Section 7.2.5.3, SCTP CRC Offloading. This note added to section: The CRC field of the SCTP header must be set to zero prior to requesting a CRC calculation offload.
  • Section 8.17.23, Time Sync Rx Configuration - TSYNCRXCFG (0x05f50; RW). The TRNSSPC description column was updated.
  • LinkSec references corrected to MACsec.
2.49       8/11/2010
  • Table 2-8; JTAG reset input (AC5) described.
  • Section 6.10.5, PBA Number Module (Word 0x08, 0x09). PBA format updated.
  • Section 7.1.1.2, Rx Queuing in a Virtualized Environment. Corrected.
2.50       9/14/2010
  • Table 2-9, Reserved Pins and No-Connects. Table corrected.
  • Section 6.10.5, PBA Number Module (Word 0x08, 0x09). Language of section updated to address issues.
  • Section 8.8.7, Interrupt Cause Read Register - ICR (0x01500; RC/W1C). Table was updated. See ICR.MDDET [bit 28].
  • Table 11-14, NC-SI AC Specifications. Table corrected.
2.6        11/5/2010
  • On title page, in feature table, under additional product features: bullet updated to "Memories parity or ECC protection".
  • Chapter 6.0, Non-Volatile Memory Map - EEPROM. Chapter now includes example settings for sample EEPROM and makes hardware settings clear.
  • Section 7.2.2.3.11, PAYLEN (18). Note text updated.
  • Section 8.12.14, Tx Descriptor Completion Write-Back Address Low - TDWBAL (0x0e038 + 0x40*n [n=0...15]; R/W). Description clarified; see bits 32:2.
2.61       12/10/2010
  • Indicated hardware defaults in Chapter 6.0, Non-Volatile Memory Map - EEPROM. Added loaded values for 82576_dev_start_no_mgmt_copper_a1 image, where applicable.
2.62       5/5/2011
  • Section 1.0, Introduction. Simple block diagram of part added.
  • Section 3.5.6.1, General and Section 3.5.6.2, MAC Loopback. Information added on MAC loopback. Not used on this device.
  • Section 6.10.2, OEM Specific (Word 0x04). Definition updated.
  • Section 6.10.6.1, Main Setup Options PCI Function 0 (Word 0x30). Word updated. See bits 5, 2-0.
  • Section 7.1.1.5, L3/L4 5-Tuple Filters. Note added to clarify the filtering of fragmented packets.
  • Section 7.1.2.1.1, Unicast Filter. Error corrected. There are 24 host unicast addresses, not 16 as previously stated.
  • Section 9.5.5.12, Device Control 2 Register (0xc8; RW). Note added. Expresses write limitation.
  • Section 11-11, External Clock Oscillator Connectivity to the 82576. Figure corrected (font problem).
2.63       12/9/2011
  • Figure 11-5. Random line removed from drawing.
  • Section 3.5.8.2.1, Transition to SerDes/SGMII Mode. Procedure updated.
  • Section 6.10.1, Compatibility (Word 0x03). Bit 14, SerDes Forced Mode Enable, description added.
  • Section 6.8.7, NC-SI Configuration (Offset 0x6). Updated.
  • Section 9.4.11.1, 32-bit Mapping, Section 9.4.11.2, 64-bit Mapping without I/O BAR, Section 9.4.11.3, 64-bit Mapping without Flash BAR; prefetch memory, bit 3 description update. New text: "This bit should be set only on systems that do not generate prefetchable cycles."
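
Several revision entries above quote per-queue register addresses in the form "base + stride*n [n=0...max]" and give the RDLEN limit as 0x80000 (32K descriptors). The following is a minimal sketch of that arithmetic, not driver code. The base offsets, strides, and index ranges are copied from the register headings quoted in the revision history; the table name, function names, and the 16-byte descriptor size (inferred from 0x80000 bytes corresponding to 32K descriptors) are assumptions made for illustration.

```python
# Illustrative only: the names below are hypothetical, not from the datasheet.
# Offsets, strides, and index ranges are taken from the register headings
# quoted in the revision history (for example "0x0c00c + 0x40*n [n=0...15]").
INDEXED_REGISTERS = {
    # name: (base offset, stride in bytes, number of instances)
    "SRRCTL": (0x0C00C, 0x40, 16),
    "RDLEN": (0x0C008, 0x40, 16),
    "RDH": (0x0C010, 0x40, 16),
    "TDH": (0x0E010, 0x40, 16),
    "TDWBAL": (0x0E038, 0x40, 16),
    "EITR": (0x01680, 0x4, 25),  # n = 0...24
}

# Assumption: 16 bytes per descriptor, inferred from the statement that
# the maximum RDLEN value 0x80000 corresponds to 32K descriptors.
DESCRIPTOR_BYTES = 16
RDLEN_MAX = 0x80000


def register_offset(name, n):
    """Return the offset of instance n of an indexed register."""
    base, stride, count = INDEXED_REGISTERS[name]
    if not 0 <= n < count:
        raise ValueError(f"{name} index {n} outside 0..{count - 1}")
    return base + stride * n


def descriptors_in_ring(rdlen):
    """Convert a ring length in bytes into a descriptor count."""
    if not 0 <= rdlen <= RDLEN_MAX:
        raise ValueError(f"ring length {rdlen:#x} exceeds {RDLEN_MAX:#x}")
    return rdlen // DESCRIPTOR_BYTES


if __name__ == "__main__":
    print(hex(register_offset("SRRCTL", 15)))
    print(descriptors_in_ring(RDLEN_MAX))
```

Keeping the ranges in one table means an out-of-range queue index fails loudly instead of silently addressing a neighboring register. Any real use should be checked against the register chapters of the datasheet itself.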

Contents

1.0 Introduction
  1.1 Scope
  1.2 Terminology and Acronyms
    1.2.1 External Specification and Documents
      1.2.1.1 Network Interface Documents
      1.2.1.2 Host Interface Documents
      1.2.1.3 Virtualization Documents
      1.2.1.4 Networking Protocol Documents
      1.2.1.5 Manageability Documents
      1.2.1.6 Security Documents
    1.2.2 Intel Application Notes
    1.2.3 Reference Schematics
    1.2.4 Checklists
  1.3 Product Overview
    1.3.1 System Configurations
  1.4 External Interface
    1.4.1 PCIe* Interface
    1.4.2 Network Interfaces
    1.4.3 EEPROM Interface
    1.4.4 Serial Flash Interface
    1.4.5 SMBus Interface
    1.4.6 NC-SI Interface
    1.4.7 MDIO/2 Wires Interfaces
    1.4.8 Software-Definable Pins (SDP) Interface (General-Purpose I/O)
    1.4.9 LEDs Interface
  1.5 Comparing Product Features
  1.6 Overview of New Capabilities
    1.6.1 IPsec Offload for Flows
    1.6.2 Security
    1.6.3 Transmit Rate Limiting (TRL)
    1.6.4 Performance
      1.6.4.1 Tx Descriptor Write-Back
    1.6.5 Rx and Tx Queues
    1.6.6 Interrupts
    1.6.7 Virtualization
      1.6.7.1 PCI SR IOV
      1.6.7.2 Packets Classification
      1.6.7.3 Hardware Virtualization
      1.6.7.4 Bandwidth Allocation
    1.6.8 VPD
    1.6.9 64 Bit BARs Support
    1.6.10 IEEE 1588 - Precision Time Protocol (PTP)
  1.7 Device Data Flows
    1.7.1 Transmit Data Flow
    1.7.2 Receive Data Flow
2.0 Pin Interface
  2.1 Pin Assignment
    2.1.1 PCIe
    2.1.2 Flash and EEPROM Ports (8)
    2.1.3 System Management Bus (SMB) Interface
    2.1.4 NC-SI Interface Pins
    2.1.5 Miscellaneous Pins
    2.1.6 SerDes/SGMII Pins
    2.1.7 SFP Pins
    2.1.8 Media Dependent Interface (PHY's MDI) Pins
      2.1.8.1 LED's (8)
      2.1.8.2 Analog Pins
    2.1.9 Testability Pins
    2.1.10 Reserved Pins and No-Connects
    2.1.11 Power Supply Pins
  2.2 Pull-Ups/Pull-Downs
  2.3 Strapping
  2.4 Interface Diagram
  2.5 Pin List (Alphabetical)
  2.6 Ball Out
3.0 Interconnects
  3.1 PCIe
    3.1.1 PCIe Overview
      3.1.1.1 Architecture, Transaction and Link Layer Properties
      3.1.1.2 Physical Interface Properties
      3.1.1.3 Advanced Extensions
    3.1.2 Functionality - General
      3.1.2.1 Native/Legacy
      3.1.2.2 Locked Transactions
      3.1.2.3 End to End CRC (ECRC)
    3.1.3 Host I/F
      3.1.3.1 Tag IDs
        3.1.3.1.1 Tag ID Allocation for Read Transactions
        3.1.3.1.2 Tag ID Allocation for Write Transactions
          3.1.3.1.2.1 Case 1 - DCA Disabled in the System:
          3.1.3.1.2.2 Case 2 - DCA Enabled in the System, but Disabled for the Request:
          3.1.3.1.2.3 Case 3 - DCA Enabled in the System, DCA Enabled for the Request:
      3.1.3.2 Completion Timeout Mechanism
        3.1.3.2.1 Completion Timeout Enable
        3.1.3.2.2 Resend Request Enable
        3.1.3.2.3 Completion Timeout Period
    3.1.4 Transaction Layer
      3.1.4.1 Transaction Types Accepted by the 82576
        3.1.4.1.1 Configuration Request Retry Status
        3.1.4.1.2 Partial Memory Read and Write Requests
      3.1.4.2 Transaction Types Initiated by the 82576
        3.1.4.2.1 Data Alignment
        3.1.4.2.2 Multiple Tx Data Read Requests
      3.1.4.3 Messages
        3.1.4.3.1 Message Handling by the 82576 (as a Receiver)
        3.1.4.3.2 Message Handling by the 82576 (as a Transmitter)
      3.1.4.4 Ordering Rules
        3.1.4.4.1 Out of Order Completion Handling
      3.1.4.5 Transaction Definition and Attributes
        3.1.4.5.1 Max Payload Size
        3.1.4.5.2 Traffic Class (TC) and Virtual Channels (VC)
        3.1.4.5.3 Relaxed Ordering
        3.1.4.5.4 Snoop Not Required
        3.1.4.5.5 No Snoop and Relaxed Ordering for LAN Traffic
          3.1.4.5.5.1 No-Snoop Option for Payload
          3.1.4.5.5.2 No Snoop Option for TSO Header
      3.1.4.6 Flow Control
        3.1.4.6.1 82576 Flow Control Rules
        3.1.4.6.2 Upstream Flow Control Tracking
        3.1.4.6.3 Flow Control Update Frequency
        3.1.4.6.4 Flow Control Timeout Mechanism
      3.1.4.7 Error Forwarding
    3.1.5 Data Link Layer
      3.1.5.1 ACK/NAK Scheme
      3.1.5.2 Supported DLLPs
      3.1.5.3 Transmit EDB Nullifying
    3.1.6 Physical Layer
3.1.6.1 link width
3.1.6.2 polarity inversion
3.1.6.3 l0s exit latency
3.1.6.4 lane-to-lane de-skew
3.1.6.5 lane reversal
3.1.6.6 reset
3.1.6.7 scrambler disable
3.1.7 error events and error reporting
3.1.7.1 mechanism in general
3.1.7.2 error events
3.1.7.3 error pollution
3.1.7.4 completion with unsuccessful completion status
3.1.7.5 error reporting changes
3.1.8 performance monitoring
3.1.8.1 leaky bucket mode
3.1.9 pcie power management
3.1.10 pcie programming interface
3.2 management interfaces
3.2.1 smbus
3.2.1.1 channel behavior
3.2.1.1.1 smbus addressing
3.2.1.1.2 smbus notification methods
3.2.1.1.2.1 smbus alert and alert response method
3.2.1.1.2.2 asynchronous notify method
3.2.1.1.2.3 direct receive method
3.2.1.1.3 receive tco flow
3.2.1.1.4 transmit tco flow
3.2.1.1.5 transmit errors in sequence handling
3.2.1.1.6 tco command aborted flow
3.2.1.1.7 concurrent smbus transactions
3.2.1.1.8 smbus arp functionality
3.2.1.1.8.1 smbus arp in dual-/single-address mode
3.2.1.1.8.2 smbus arp flow
3.2.1.1.8.3 smbus arp udid content
3.2.1.1.9 lan fail-over through smbus
3.2.2 nc-si
3.2.2.1 electrical characteristics
3.2.2.2 nc-si transactions
3.3 flash / eeprom
3.3.1 eeprom interface
3.3.1.1 general overview
3.3.1.2 eeprom device
3.3.1.3 software accesses
3.3.1.4 signature field
3.3.1.5 protected eeprom space
3.3.1.5.1 initial eeprom programming
3.3.1.5.2 activating the protection mechanism
3.3.1.5.3 non permitted accessing to protected areas in the eeprom
3.3.1.6 eeprom recovery
3.3.1.7 eeprom-less support
3.3.1.7.1 access to the eeprom controlled feature
3.3.2 shared eeprom
3.3.2.1 eeprom deadlock avoidance
3.3.2.2 eeprom map shared words
3.3.3 vital product data (vpd) support
3.3.4 flash interface
3.3.4.1 flash interface operation
3.3.4.2 flash write control
3.3.4.3 flash erase control
3.3.5 shared flash
3.3.5.1 flash access contention
3.3.5.2 flash deadlock avoidance
3.4 configurable i/o pins
3.4.1 general-purpose i/o (software-definable pins)
3.4.2 software watchdog
3.4.2.1 watchdog re-arm
3.4.3 leds
3.5 network interfaces
3.5.1 overview
3.5.2 mac functionality
3.5.2.1 internal gmii/mii interface
3.5.2.2 mdio/mdc
3.5.2.2.1 mdic register usage
3.5.2.3 duplex operation with copper phy
3.5.2.3.1 full duplex
3.5.2.3.2 half duplex
3.5.3 serdes, sgmii support
3.5.3.1 serdes analog block
3.5.3.2 serdes/sgmii pcs block
3.5.3.3 gbe physical coding sub-layer (pcs)
3.5.3.3.1 8b10b encoding/decoding
3.5.3.3.2 code groups and ordered sets
3.5.4 auto-negotiation and link setup features
3.5.4.1 serdes link configuration
3.5.4.1.1 signal detect indication
3.5.4.1.2 mac link speed
3.5.4.1.3 serdes mode auto-negotiation
3.5.4.1.4 forcing link
3.5.4.1.5 hw detection of non-auto-negotiation partner
3.5.4.2 sgmii link configuration
3.5.4.2.1 sgmii auto-negotiation
3.5.4.2.2 forcing link
3.5.4.2.3 mac speed resolution
3.5.4.3 copper phy link configuration
3.5.4.3.1 phy auto-negotiation (speed, duplex, flow control)
3.5.4.3.2 mac speed resolution
3.5.4.3.2.1 forcing mac speed
3.5.4.3.2.2 using internal phy direct link-speed indication
3.5.4.3.3 mac full-/half-duplex resolution
3.5.4.3.4 using phy registers
3.5.4.3.5 comments regarding forcing link
3.5.4.4 loss of signal/link status indication
3.5.5 ethernet flow control (fc)
3.5.5.1 mac control frames and receiving flow control packets
3.5.5.1.1 structure of 802.3x fc packets
3.5.5.1.2 operation and rules
3.5.5.1.3 timing considerations
3.5.5.2 pause and mac control frames forwarding
3.5.5.3 transmission of pause frames
3.5.5.3.1 operation and rules
3.5.5.3.2 software initiated pause frame transmission
3.5.5.4 ipg control and pacing
3.5.5.4.1 fixed ipg extension
3.5.5.4.2 limiting payload rate
3.5.6 loopback support
3.5.6.1 general
3.5.6.2 mac loopback
3.5.6.3 internal phy loopback
3.5.6.3.1 setting the 82576 to phy loopback mode
3.5.6.4 serdes loopback
3.5.6.4.1 setting serdes loopback mode
3.5.6.5 external phy loopback
3.5.6.5.1 setting the 82576 to external phy loopback mode
3.5.7 integrated copper phy functionality
3.5.7.1 phy initialization functionality
3.5.7.1.1 auto mdio register initialization
3.5.7.1.2 general register initialization
3.5.7.1.3 mirror bit initialization
3.5.7.2 determining link state
3.5.7.2.1 false link
3.5.7.2.2 forced operation
3.5.7.2.3 auto negotiation
3.5.7.2.4 parallel detection
3.5.7.2.5 auto cross-over
3.5.7.2.6 10/100 mb/s mismatch resolution
3.5.7.2.7 link criteria
3.5.7.2.7.1 1000base-t
3.5.7.2.7.2 100base-tx
3.5.7.2.7.3 10base-t
3.5.7.3 link enhancements
3.5.7.3.1 smartspeed
3.5.7.3.1.1 using smartspeed
3.5.7.4 flow control
3.5.7.5 management data interface
3.5.7.6 low power operation and power management
3.5.7.6.1 power down via the phy register
3.5.7.6.2 power management state
3.5.7.6.3 an1000_dis
3.5.7.6.4 low power link up - link speed control
3.5.7.6.4.1 d0a state
3.5.7.6.4.2 non-d0a state
3.5.7.6.5 smart power-down (spd)
3.5.7.6.5.1 back-to-back smart power-down
3.5.7.6.6 link energy detect
3.5.7.6.7 phy power-down state
3.5.7.7 advanced diagnostics
3.5.7.7.1 tdr - time domain reflectometry
3.5.7.7.2 channel frequency response
3.5.7.8 1000 mb/s operation
3.5.7.8.1 introduction
3.5.7.8.2 transmit functions
3.5.7.8.2.1 scrambler
3.5.7.8.2.2 transmit fifo
3.5.7.8.2.3 transmit phase-locked loop pll
3.5.7.8.2.4 trellis encoder
3.5.7.8.2.5 4dpam5 encoder
3.5.7.8.2.6 spectral shaper
3.5.7.8.2.7 low-pass filter
3.5.7.8.2.8 line driver
3.5.7.8.3 receive functions
3.5.7.8.3.1 hybrid
3.5.7.8.3.2 automatic gain control (agc)
3.5.7.8.3.3 timing recovery
3.5.7.8.3.4 analog-to-digital converter (adc)
3.5.7.8.3.5 digital signal processor (dsp)
3.5.7.8.3.6 de scrambler
3.5.7.8.3.7 viterbi decoder/decision feedback equalizer (dfe)
3.5.7.8.3.8 4dpam5 decoder
3.5.7.8.3.9 100 mb/s operation
3.5.7.8.3.10 10 mb/s operation
3.5.7.8.3.11 link test
3.5.7.8.3.12 10base-t link failure criteria and override
3.5.7.8.3.13 jabber
contents ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 13 3.5.7.8.3.14 polarity correction .............................................................................................. 15 6 3.5.7.8.3.15 dribble bits....................................................................................................... . 157 3.5.7.8.3.16 phy address ...................................................................................................... 15 7 3.5.8 media auto sense.......................................................................................................... .....157 3.5.8.1 auto sense setup ........................................................................................................ 157 3.5.8.1.1 serdes/sgmii detect mode (phy is active) ................................................................... 157 3.5.8.1.2 phy detect mode (serdes/sgmii is active) ................................................................... 158 3.5.8.2 switching between media .............................................................................................. 158 3.5.8.2.1 transition to serdes/sgmii mode................................................................................ 158 3.5.8.2.2 transition to internal phy mode .................................................................................. 159 4.0 initialization ............................................................................................................................161 4.1 power up .................................................................................................................... ........... 161 4.1.1 power-up sequence......................................................................................................... ...161 4.1.2 power-up timing diagram ................................................................................................... 
162 4.1.2.1 timing requirements.................................................................................................... 1 63 4.1.2.2 timing guarantees ....................................................................................................... 163 4.2 reset operation ............................................................................................................. ......... 163 4.2.1 reset sources............................................................................................................. .......163 4.2.1.1 internal_power_on_reset ............................................................................................. 164 4.2.1.2 pe_rst_n................................................................................................................ ... 164 4.2.1.3 in-band pcie reset ...................................................................................................... 164 4.2.1.4 d3hot to d0 transition ................................................................................................. 1 64 4.2.1.5 function level reset (flr) ............................................................................................ 16 4 4.2.1.5.1 pf (physical function) flr or flr in non-iov mode .......................................................164 4.2.1.5.2 vf (virtual function) flr (function level reset) ........................................................... 164 4.2.1.5.3 iov (io virtualization) disable .................................................................................... 164 4.2.1.6 software reset .......................................................................................................... .. 165 4.2.1.6.1 full software reset ................................................................................................... 165 4.2.1.6.2 physical function (pf) software reset.......................................................................... 
165 4.2.1.6.3 vf software reset..................................................................................................... 165 4.2.1.7 force tco............................................................................................................... .... 166 4.2.1.8 firmware reset .......................................................................................................... . 166 4.2.1.9 eeprom reset ............................................................................................................ . 166 4.2.1.10 phy reset .............................................................................................................. ..... 166 4.2.2 reset effects ............................................................................................................. ........167 4.2.3 phy behavior during a manageability session ........................................................................173 4.3 function disable............................................................................................................ .......... 174 4.3.1 general................................................................................................................... ..........174 4.3.2 overview .................................................................................................................. ........174 4.3.3 control options ........................................................................................................... .......176 4.3.3.1 pci functions disable options ........................................................................................ 176 4.3.4 event flow for enable/disable functions ................................................................................176 4.3.4.1 multi-function advertisement ........................................................................................ 
4.3.4.2 legacy interrupts utilization
4.3.4.3 power reporting
4.4 device disable
4.4.1 bios handling of device disable
4.5 software initialization and diagnostics
4.5.1 introduction
4.5.2 power up state
4.5.3 initialization sequence
4.5.4 interrupts during initialization
4.5.5 global reset and general configuration
4.5.6 flow control setup
4.5.7 link setup mechanisms and control/status bit summary
4.5.7.1 phy initialization
4.5.7.2 mac/phy link setup (ctrl_ext.link_mode = 00)
4.5.7.2.1 mac settings automatically based on duplex and speed resolved by phy (ctrl.frcdplx = 0b, ctrl.frcspd = 0b)
4.5.7.2.2 mac duplex and speed settings forced by software based on resolution of phy (ctrl.frcdplx = 1b, ctrl.frcspd = 1b)
4.5.7.2.3 mac/phy duplex and speed settings both forced by software (fully-forced link setup) (ctrl.frcdplx = 1b, ctrl.frcspd = 1b, ctrl.slu = 1b)
4.5.7.3 mac/serdes link setup (ctrl_ext.link_mode = 11b)
4.5.7.3.1 hardware auto-negotiation enabled (pcs_lctl. an enable = 1b; ctrl.frcspd = 0b; ctrl.frcdplx = 0)
4.5.7.3.2 auto-negotiation skipped (pcs_lctl. an enable = 0b; ctrl.frcspd = 1b; ctrl.frcdplx = 1)
4.5.7.4 mac/sgmii link setup (ctrl_ext.link_mode = 10b)
4.5.7.4.1 hardware auto-negotiation enabled (pcs_lctl. an enable = 1b, ctrl.frcdplx = 0b, ctrl.frcspd = 0b)
4.5.8 initialization of statistics
4.5.9 receive initialization
4.5.9.1 initialize the receive control register
4.5.9.2 dynamic enabling and disabling of receive queues
4.5.10 transmit initialization
4.5.10.1 dynamic queue enabling and disabling
4.5.11 virtualization initialization flow
4.5.11.1 next generation vmdq mode
4.5.11.1.1 global filtering and offload capabilities
4.5.11.1.2 mirroring rules
4.5.11.1.3 per pool settings
4.5.11.1.4 security features
4.5.11.1.4.1 anti spoofing
4.5.11.1.4.2 storm control
4.5.11.1.5 allocation of tx bandwidth to vms
4.5.11.1.5.1 configuring tx bandwidth to vms
4.5.11.1.5.2 link speed change procedure
4.5.11.2 iov initialization
4.5.11.2.1 pf driver initialization
4.5.11.2.1.1 vf specific reset coordination
4.5.11.2.2 vf driver initialization
4.5.11.2.3 full reset coordination
4.5.11.2.4 iov disable
4.5.11.2.5 vfre/vfte
4.5.12 transmit rate limiting configuration
4.5.12.1 link speed change procedure
4.5.12.2 configuration flow
4.5.12.3 configuration rules
4.6 access to shared resources
4.6.1 acquiring ownership over a shared resource
4.6.2 releasing ownership over a shared resource
5.0 power management
5.1 general power state information
5.1.1 pci device power states
5.1.2 pcie link power states
5.1.3 pcie link power states
5.2 82576 power states
5.2.1 d0 uninitialized state (d0u)
5.2.1.1 entry into d0u state
5.2.1.2 exit from d0u state
5.2.2 d0active state
5.2.2.1 entry to d0a state
5.2.3 d3 state (pci-pm d3hot)
5.2.3.1 entry to d3 state
5.2.3.2 exit from d3 state
5.2.3.3 master disable via ctrl register
5.2.4 dr state (d3cold)
5.2.4.1 dr disable mode
5.2.4.2 entry to dr state
5.2.4.3 auxiliary power usage
5.2.5 link disconnect
5.2.6 device power-down state
5.3 power limits by certain form factors
5.4 interconnects power management
5.4.1 pcie link power management
5.4.2 nc-si clock control
5.4.3 phy power-management
5.4.4 serdes/sgmii power management
5.5 timing of power-state transitions
5.5.1 power up (off to dup to d0u to d0a)
5.5.2 transition from d0a to d3 and back without pe_rst_n
5.5.3 transition from d0a to d3 and back with pe_rst_n
5.5.4 transition from d0a to dr and back without transition to d3
5.6 wake up
5.6.1 advanced power management wake up
5.6.2 pcie power management wake up
5.6.3 wake-up packets
5.6.3.1 pre-defined filters
5.6.3.1.1 directed exact packet
5.6.3.1.2 directed multicast packet
5.6.3.1.3 broadcast
5.6.3.1.4 magic packet
5.6.3.1.5 arp/ipv4 request packet
5.6.3.1.6 directed ipv4 packet
5.6.3.1.7 directed ipv6 packet
5.6.3.2 flexible filters
5.6.3.2.1 ipx diagnostic responder request packet
5.6.3.2.2 directed ipx packet
5.6.3.2.3 ipv6 neighbor discovery filter
5.6.3.3 wake up packet storage
6.0 non-volatile memory map - eeprom
6.1 eeprom general map
6.2 hardware accessed words
6.2.1 ethernet address (words 0x00:02)
6.2.2 initialization control word 1 (word 0x0a)
6.2.3 subsystem id (word 0x0b)
6.2.4 subsystem vendor id (word 0x0c)
6.2.5 device id (word 0x0d, 0x11)
6.2.6 dummy device id (word 0x1d)
6.2.7 initialization control word 2 lan1 (word 0x0f)
6.2.8 software defined pins control lan1 (word 0x10)
6.2.9 software defined pins control lan0 (word 0x20)
6.2.10 eeprom sizing and protected fields (word 0x12)
6.2.11 reserved (word 0x13)
6.2.12 initialization control 3 (word 0x14, 0x24)
6.2.13 pcie completion timeout configuration (word 0x15)
6.2.14 msi-x configuration (word 0x16)
6.2.15 pcie init configuration 1 word (word 0x18)
6.2.16 pcie init configuration 2 word (word 0x19)
6.2.17 pcie init configuration 3 word (word 0x1a)
6.2.18 pcie control (word 0x1b)
6.2.19 led 1,3 configuration defaults (word 0x1c, 0x2a)
6.2.20 device rev id (word 0x1e)
6.2.21 led 0,2 configuration defaults (word 0x1f, 0x2b)
6.2.22 functions control (word 0x21)
6.2.23 lan power consumption (word 0x22)
6.2.24 i/o virtualization (iov) control (word 0x25)
6.2.25 iov device id (word 0x26)
6.2.26 end of read-only (ro) area (word 0x2c)
6.2.27 start of ro area (word 0x2d)
6.2.28 watchdog configuration (word 0x2e)
6.2.29 vpd pointer (word 0x2f)
6.2.30 nc-si arbitration enable (word 0x40)
6.3 analog blocks configuration structures
6.3.1 analog configuration pointers start address (offset 0x17)
6.3.2 pcie initialization pointer (offset 0, relative to word 0x17 value)
6.3.3 phy initialization pointer (offset 1, relative to word 0x17 value)
6.3.4 serdes initialization pointer (offset 2, relative to word 0x17 value)
6.4 serdes/phy/pcie/pll/ccm initialization structures
6.4.1 block header (offset 0x0)
6.4.2 crc8 (offset 1)
6.4.3 next buffer pointer (offset 2 - optional)
6.4.4 address/data (offset 3:word count)
6.5 firmware pointers & control words
6.5.1 loader patch pointer (word 0x51)
6.5.2 no manageability patch pointer (word 0x52)
6.5.3 manageability capability/manageability enable (word 0x54)
6.5.4 pt patch configuration pointer (word 0x55)
6.5.5 pt lan0 configuration pointer (word 0x56)
6.5.6 sideband configuration pointer (word 0x57)
6.5.7 flex tco filter configuration pointer (word 0x58)
6.5.8 pt lan1 configuration pointer (word 0x59)
6.5.9 management hw config control (word 0x23)
6.6 patch structure
6.6.1 patch data size (offset 0x0)
6.6.2 block crc8 (offset 0x1)
6.6.3 patch entry point pointer low word (offset 0x2)
6.6.4 patch entry point pointer high word (offset 0x3)
6.6.5 patch version 1 word (offset 0x4)
6.6.6 patch version 2 word (offset 0x5)
6.6.7 patch version 3 word (offset 0x6)
6.6.8 patch version 4 word (offset 0x7)
6.6.9 patch data words (offset 0x8, block length)
6.7 pt lan configuration structure
6.7.1 section header (offset 0x0)
6.7.2 lan0 ipv4 address 0 lsb, mipaf0 (offset 0x01)
6.7.3 lan0 ipv4 address 0 msb, mipaf0 (offset 0x02)
6.7.4 lan0 ipv4 address 1; mipaf1 (offset 0x03:0x04)
6.7.5 lan0 ipv4 address 2; mipaf2 (offset 0x05h:0x06)
6.7.6 lan0 ipv4 address 3; mipaf3 (offset 0x07h:0x08)
6.7.7 lan0 mac address 0 lsb, mmal0 (offset 0x09)
6.7.8 lan0 mac address 0 lsb, mmal0 (offset 0x0a)
6.7.9 lan0 mac address 0 msb, mmah0 (offset 0x0b)
6.7.10 lan0 mac address 1; mmal/h1 (offset 0x0c:0x0e)
6.7.11 lan0 mac address 2; mmal/h2 (offset 0x0f:0x11)
6.7.12 lan0 mac address 3; mmal/h3 (offset 0x12:0x14)
6.7.13 lan0 udp flex filter ports 0:15; mfutp registers (offset 0x15:0x24)
6.7.14 lan0 vlan filter 0:7; mavtv registers (offset 0x25:0x2c)
6.7.15 lan0 manageability filters valid; mfval lsb (offset 0x2d)
6.7.16 lan0 manageability filters valid; mfval msb (offset 0x2e)
6.7.17 lan0 manc value lsb (offset 0x2f)
6.7.18 lan0 manc value msb (offset 0x30)
6.7.19 lan0 receive enable 1 (offset 0x31)
6.7.20 lan0 receive enable 2 (offset 0x32)
6.7.21 lan0 manc2h value lsb (offset 0x33)
6.7.22 lan0 manc2h value msb (offset 0x34)
6.7.23 manageability decision filters; mdef0,1 (offset 0x35)
6.7.24 manageability decision filters; mdef0,2 (offset 0x36)
6.7.25 manageability decision filters; mdef0,3 (offset 0x37)
6.7.26 manageability decision filters; mdef0,4 (offset 0x38)
6.7.27 manageability decision filters; mdef1:6, 1:4 (offset 0x39:0x50)
6.7.28 ethertype data (word 0x
6.7.29 ethertype filter; metf0, 1 (offset 0x51)
6.7.30 ethertype filter; metf0, 1 (offset 0x52)
6.7.31 ethertype filter; metf1:3,1:2 (offset 0x53:0x58)
6.7.32 arp response ipv4 address 0 lsb (offset 0x59)
6.7.33 arp response ipv4 address 0 msb (offset 0x5a)
6.7.34 lan0 ipv6 address 0 lsb; mipaf (offset 0x5b)
6.7.35 lan0 ipv6 address 0 msb; mipaf (offset 0x5c)
6.7.36 lan0 ipv6 address 0 lsb; mipaf (offset 0x5d)
6.7.37 lan0 ipv6 address 0 msb; mipaf (offset 0x5e)
6.7.38 lan0 ipv6 address 0 lsb; mipaf (offset 0x5f)
6.7.39 lan0 ipv6 address 0 msb; mipaf (offset 0x60)
6.7.40 lan0 ipv6 address 0 lsb; mipaf (offset 0x61)
6.7.41 lan0 ipv6 address 0 msb; mipaf (offset 0x62)
6.7.42 lan0 ipv6 address 1; mipaf (offset 0x63:0x6a)
6.7.43 lan0 ipv6 address 2; mipaf (offset 0x6b:0x72)
6.8 sideband configuration structure
6.8.1 section header (offset 0x0)
6.8.2 smbus max fragment size (offset 0x1)
6.8.3 smbus notification timeout and flags (offset 0x2)
6.8.4 smbus slave address (offset 0x3)
6.8.5 smbus fail-over register; low word (offset 0x4)
6.8.6 smbus fail-over register; high word (offset 0x5)
6.8.7 nc-si configuration (offset 0x6)
6.8.8 nc-si hardware arbitration configuration (offset 0x8)
6.8.9 reserved (offset 0x9 - 0xc)
6.9 flex tco filter configuration structure
6.9.1 section header (offset 0x0)
6.9.2 flex filter length and control (offset 0x01)
6.9.3 flex filter enable mask (offset 0x02:0x09)
6.9.4 flex filter data (offset 0x0a - block length)
6.10 software accessed words
6.10.1 compatibility (word 0x03)
6.10.2 oem specific (word 0x04)
6.10.3 oem specific (word 0x06, 0x07)
6.10.4 eeprom image revision (word 0x05)
6.10.5 pba number module (word 0x08, 0x09)
6.10.6 pxe configuration words (word 0x30:3b)
6.10.6.1 main setup options pci function 0 (word 0x30)
6.10.6.2 configuration customization options pci function 0 (word 0x31)
6.10.6.3 pxe version (word 0x32)
6.10.6.4 iba capabilities (word 0x33)
6.10.6.5 setup options pci function 1 (word 0x34)
6.10.6.6 configuration customization options pci function 1 (word 0x35)
6.10.6.7 iscsi option rom version (word 0x36)
6.10.6.8 setup options pci function 2 (word 0x38)
6.10.6.9 configuration customization options pci function 2 (word 0x39)
6.10.6.10 setup options pci function 3 (word 0x3a)
6.10.6.11 configuration customization options pci function 3 (word 0x3b)
6.10.7 iscsi boot configuration offset (word 0x3d)
6.10.7.1 iscsi module structure
6.10.8 alternate mac address pointer (word 0x37)
6.10.9 checksum word (word 0x3f)
6.10.10 image unique id (word 0x42, 0x43)
7.0 inline functions
7.1 receive functionality
7.1.1 rx queues assignment ... 273
7.1.1.1 queuing in a non-virtualized environment ... 275
7.1.1.2 rx queuing in a virtualized environment ... 276
7.1.1.3 queue configuration registers ... 279
7.1.1.4 l2 ether-type filters ... 279
7.1.1.5 l3/l4 5-tuple filters ... 280
7.1.1.6 syn packet filters ... 281
7.1.1.7 receive-side scaling (rss) ... 281
7.1.1.7.1 rss hash function ... 283
7.1.1.7.1.1 hash for ipv4 with tcp ... 285
7.1.1.7.1.2 hash for ipv4 with udp ... 285
7.1.1.7.1.3 hash for ipv4 without tcp ... 286
7.1.1.7.1.4 hash for ipv6 with tcp ... 286
7.1.1.7.1.5 hash for ipv6 with udp ... 286
7.1.1.7.1.6 hash for ipv6 without tcp ... 286
7.1.1.7.2 indirection table ... 286
7.1.1.7.3 rss verification suite ... 286
7.1.1.7.3.1 ipv4 ... 287
7.1.1.7.3.2 ipv6 ... 287
7.1.1.7.4 association through mac address ... 287
7.1.2 l2 packet filtering ... 287
7.1.2.1 mac address filtering ... 289
7.1.2.1.1 unicast filter ... 290
7.1.2.1.2 multicast filter (partial) ... 291
7.1.2.2 vlan filtering ... 291
7.1.2.3 manageability filtering ... 292
7.1.3 receive data storage ... 294
7.1.3.1 host buffers ... 294
7.1.3.2 on-chip rx buffers ... 294
7.1.3.3 on-chip descriptor buffers ... 294
7.1.4 legacy receive descriptor format ... 294
7.1.5 advanced receive descriptors ... 298
7.1.5.1 advanced receive descriptors (read format) ... 298
7.1.5.2 advanced receive descriptors - writeback format ... 298
7.1.6 receive descriptor fetching ... 304
7.1.7 receive descriptor write-back ... 304
7.1.8 receive descriptor ring structure ... 305
7.1.8.1 low receive descriptors threshold ... 306
7.1.9 header splitting and replication ... 307
7.1.9.1 purpose ... 307
7.1.9.2 description ... 307
7.1.10 receive packet checksum off loading ... 310
7.1.10.1 filters details ... 311
7.1.10.1.1 mac address filter ... 311
7.1.10.1.2 snap/vlan filter ... 311
7.1.10.1.3 ipv4 filter ... 312
7.1.10.1.4 ipv6 filter ... 312
7.1.10.1.5 ipv6 extension headers ... 312
7.1.10.1.6 udp/tcp filter ... 313
7.1.10.2 receive udp fragmentation checksum ... 314
7.1.11 sctp offload ... 314
7.2 transmit functionality ... 315
7.2.1 packet transmission ... 315
7.2.1.1 transmit data storage ... 315
7.2.1.2 on-chip tx buffers ... 315
7.2.1.3 on-chip descriptor buffers ... 315
7.2.1.4 transmit contexts ... 315
7.2.2 transmit descriptors ... 316
7.2.2.1 legacy transmit descriptor format ... 317
7.2.2.1.1 address (64) ... 317
7.2.2.1.2 length ... 317
7.2.2.1.3 checksum offset and start - cso and css ... 318
7.2.2.1.4 command byte - cmd ... 318
7.2.2.1.5 status - sta ... 319
7.2.2.1.6 dd (bit 0) - descriptor done status ... 320
7.2.2.1.7 vlan ... 320
7.2.2.2 advanced transmit context descriptor ... 320
7.2.2.2.1 iplen (9) ... 320
7.2.2.2.2 maclen (7) ... 320
7.2.2.2.3 ipsec sa idx (8) ... 321
7.2.2.2.4 reserved (24) ... 321
7.2.2.2.5 ips_esp_len (9) ... 321
7.2.2.2.6 tucmd (11) ... 321
7.2.2.2.7 dtyp (4) ... 322
7.2.2.2.8 rsv (5) ... 322
7.2.2.2.9 dext ... 322
7.2.2.2.10 rsv (6) ... 322
7.2.2.2.11 idx (3) ... 322
7.2.2.2.12 rsv (1) ... 322
7.2.2.2.13 l4len (8) ... 322
7.2.2.2.14 mss (16) ... 322
7.2.2.3 advanced transmit data descriptor ... 323
7.2.2.3.1 address (64) ... 324
7.2.2.3.2 dtalen (16) ... 324
7.2.2.3.3 rsv (2) ... 324
7.2.2.3.4 mac (2) ... 324
7.2.2.3.5 dtyp (4) ... 324
7.2.2.3.6 dcmd (8) ... 324
7.2.2.3.7 sta (4) ... 325
7.2.2.3.8 idx (3) ... 325
7.2.2.3.9 rsv (1) ... 325
7.2.2.3.10 popts (6) ... 325
7.2.2.3.11 paylen (18) ... 326
7.2.2.4 transmit descriptor ring structure ... 326
7.2.2.5 transmit descriptor fetching ... 328
7.2.2.6 transmit descriptor write-back ... 329
7.2.3 tx completions head write-back ... 330
7.2.3.1 description ... 330
7.2.4 tcp/udp segmentation ... 330
7.2.4.1 assumptions ... 331
7.2.4.2 transmission process ... 331
7.2.4.2.1 tcp segmentation data fetch control ... 332
7.2.4.2.2 tcp segmentation write-back modes ... 332
7.2.4.3 tcp segmentation performance ... 333
7.2.4.4 packet format ... 333
7.2.4.5 tcp/udp segmentation indication ... 334
7.2.4.6 transmit checksum offloading with tcp/udp segmentation ... 335
7.2.4.7 ip/tcp/udp header updating ... 336
7.2.4.7.1 tcp/ip/udp header for the first frames ... 336
7.2.4.7.2 tcp/ip/udp headers for the subsequent frames ... 337
7.2.4.7.3 tcp/ip/udp headers for the last frame ... 338
7.2.4.8 ip/tcp/udp checksum offloading ... 338
7.2.4.9 data flow ... 338
7.2.5 checksum offloading in non-segmentation mode ... 339
7.2.5.1 ip checksum ... 340
7.2.5.2 tcp checksum ... 340
7.2.5.3 sctp crc offloading ... 341
7.2.5.4 checksum supported per packet types ... 341
7.2.6 multiple transmit queues ... 342
7.2.6.1 bandwidth allocation to virtual machines / transmit queues ... 342
7.3 interrupts ... 343
7.3.1 mapping of interrupt causes ... 343
7.3.1.1 legacy and msi interrupt modes ... 343
7.3.1.2 msi-x mode - non-iov mode ... 344
7.3.1.3 msi-x interrupts in sr-iov mode ... 346
7.3.2 registers ... 347
7.3.2.1 interrupt cause register (icr) ... 348
7.3.2.1.1 legacy mode ... 348
7.3.2.1.2 advanced mode ... 348
7.3.2.2 interrupt cause set register (ics) ... 349
7.3.2.3 interrupt mask set/read register (ims) ... 349
7.3.2.4 interrupt mask clear register (imc) ... 349
7.3.2.5 interrupt acknowledge auto-mask register (iam) ... 349
7.3.2.6 extended interrupt cause registers (eicr) ... 349
7.3.2.6.1 msi/int-a mode ... 349
7.3.2.6.2 msi-x mode ... 350
7.3.2.7 extended interrupt cause set register (eics) ... 350
7.3.2.8 extended interrupt mask set and read register (eims) & extended interrupt mask clear register (eimc) ... 350
7.3.2.9 extended interrupt auto clear enable register (eiac) ... 350
7.3.2.10 extended interrupt auto mask enable register (eiam) ... 350
7.3.2.11 gpie ... 351
7.3.3 msi-x and vectors ... 351
7.3.3.1 usage of spare msi-x vectors by physical function ... 352
7.3.3.2 interrupt moderation ... 352
7.3.3.2.1 more on using eitr ... 354
7.3.4 clearing interrupt causes ... 354
7.3.4.1 auto-clear ... 355
7.3.4.2 write to clear ... 355
7.3.4.3 read to clear ... 355
7.3.5 rate controlled low latency interrupts (lli) ... 355
7.3.5.1 rate control mechanism ... 356
7.3.6 tcp timer interrupt ... 356
7.3.6.1 introduction ... 356
7.3.6.2 description ... 357
7.4 802.1q vlan support ... 357
7.4.1 802.1q vlan packet format ... 357
7.4.2 802.1q tagged frames ... 358
7.4.3 transmitting and receiving 802.1q packets ... 358
7.4.3.1 adding 802.1q tags on transmits ... 358
7.4.3.2 stripping 802.1q tags on receives ... 358
7.4.4 802.1q vlan packet filtering ... 359
7.4.5 double vlan support ... 360
7.4.5.1 transmit behavior ... 360
7.4.5.2 receive behavior ... 360
7.5 configurable led outputs ... 361
7.5.1 mode encoding for led outputs ... 361
7.6 memory error correction and detection ... 362
7.7 dca ... 363
7.7.1 description ... 363
7.7.2 details of implementation ... 364
7.7.2.1 pcie message format for dca ... 364
7.8 transmit rate limiting (trl) ... 365
7.9 next generation security ... 368
7.9.1 macsec ... 368
7.9.1.1 packet format ... 368
7.9.1.2 macsec header (sectag) format ... 369
7.9.1.2.1 macsec ethertype ... 369
7.9.1.2.2 tci and an ... 369
7.9.1.2.3 short length ... 370
7.9.1.2.4 packet number (pn) ... 370
7.9.1.2.5 secure channel identifier (sci) ... 370
7.9.1.2.6 initial value (iv) calculation ... 370
7.9.1.3 macsec management - kay (key agreement entity) ... 370
7.9.1.4 receive flow ... 371
7.9.1.4.1 macsec receive modes ... 372
7.9.1.4.2 receive sa exhausting - re-keying ... 373
7.9.1.4.3 receive sa context and identification ... 373
7.9.1.4.4 receive statistic counters ... 373
7.9.1.5 transmit flow ... 373
7.9.1.5.1 transmit sa exhausting - re-keying ... 374
7.9.1.5.2 transmit sa context ... 374
7.9.1.5.3 transmit statistic counters ... 374
7.9.1.6 manageability engine/host relations ... 375
7.9.1.6.1 key and tamper protection ... 375
7.9.1.6.2 key protection ... 375
7.9.1.6.3 tamper protection ... 375
7.9.1.6.4 macsec control switch between firmware and software ... 375
7.9.1.7 manageability flow ... 375
7.9.1.7.1 initialization ... 375
7.9.1.7.2 operation flow ... 376
7.9.1.8 switching ownership between host and manageability ... 376
7.9.2 ipsec support ... 376
7.9.2.1 related rfcs and other references ... 376
7.9.2.2 hardware features list ... 376
7.9.2.2.1 main features ... 376
7.9.2.2.2 cross features ... 377
7.9.2.3 software/hardware demarcation ... 378
7.9.2.4 ipsec formats exchanged between hardware and software ... 379
7.9.2.4.1 single send ... 379
7.9.2.4.2 single send with tcp/udp checksum offload ... 379
7.9.2.4.3 large send tcp/udp ... 380
7.9.2.5 tx sa table ... 382
7.9.2.5.1 tx sa table structure ... 382
7.9.2.5.2 access to tx sa table ... 383
7.9.2.6 tx hardware flow ... 383
7.9.2.6.1 single send without tcp/udp checksum offload: ... 383
7.9.2.6.2 single send with tcp/udp checksum offload: ... 383
7.9.2.6.3 large send tcp/udp: ... 384
7.9.2.7 aes-128 operation in tx ... 385
7.9.2.7.1 aes-128-gcm for esp - both authenticate and encrypt ... 386
7.9.2.7.2 aes-128-gmac for esp - authenticate only ... 386
7.9.2.7.3 aes-128-gmac for ah - authenticate only ... 386
7.9.2.8 rx descriptors ... 386
7.9.2.9 rx sa table ... 386
7.9.2.9.1 rx sa table structure ... 386
7.9.2.9.2 normal access to rx sa table ... 387
7.9.2.9.3 debugging read access to rx sa table ... 388
7.9.2.10 rx hardware flow without tcp/udp checksum offload ... 388
7.9.2.11 rx hardware flow with tcp/udp checksum offload ... 389
7.9.2.12 aes-128 operation in rx ... 389
7.9.2.13 handling ipsec packets in rx ... 389
7.10 virtualization ... 390
7.10.1 overview ... 390
7.10.1.1 direct assignment model ... 391
7.10.1.1.1 rationale ... 391
7.10.1.2 system overview ... 392
7.10.1.3 vmdq1 versus next generation vmdq ... 395
7.10.2 pci sig sr-iov support ... 395
7.10.2.1 sr-iov concepts ... 395
7.10.2.2 config space replication ... 395
7.10.2.2.1 legacy pci config space ... 396
7.10.2.2.2 memory bars assignment ... 396
7.10.2.2.3 pcie capability structure ... 397
7.10.2.2.4 pci-express capability structure ... 397
7.10.2.2.5 msi and msi-x capabilities ... 397
7.10.2.2.6 vpd capability ... 398
7.10.2.2.7 power management capability ... 398
7.10.2.2.8 serial id ... 398
7.10.2.2.9 error reporting capabilities (advanced & legacy) ... 398
7.10.2.3 function level reset (flr) capability ... 398
7.10.2.4 error reporting ... 398
7.10.2.5 ari & iov capability structures ... 399
7.10.2.6 requester id allocation ... 399
7.10.2.6.1 bus-device-function layout ... 399
7.10.2.6.1.1 ari mode ... 399
7.10.2.6.1.2 non ari mode ... 400
7.10.2.7 hardware resources assignment ... 400
7.10.2.7.1 physical function resources ... 400
7.10.2.7.2 resource summary ... 401
7.10.2.8 csr organization ... 401
7.10.2.9 iov control ... 401
7.10.2.9.1 vf to pf mailbox ... 401
7.10.2.10 interrupt handling ... 404
7.10.2.10.1 low latency interrupts ... 404
7.10.2.10.2 msi-x ... 404
7.10.2.10.3 msi ... 404
7.10.2.10.4 legacy interrupt (int-x) ... 405
7.10.2.11 dma ... 405
7.10.2.11.1 requester id ... 405
7.10.2.11.2 sharing dma resources ... 405
7.10.2.11.3 dca ... 405
7.10.2.12 timers and watchdog ... 405
7.10.2.12.1 tcp timer ... 405
7.10.2.12.2 ieee 1588 ... 405
7.10.2.12.3 watchdog ... 405
7.10.2.12.4 free running timer ... 405
7.10.2.13 power management and wakeup ... 406
7.10.2.14 link control ... 406
7.10.2.14.1 special filtering options ... 406
7.10.2.14.2 allocation of memory space for iov functions ... 406
7.10.3 packet switching ... 406
7.10.3.1 assumptions ... 406
7.10.3.2 vf selection ... 407
7.10.3.2.1 filtering capabilities ... 407
7.10.3.3 l2 filtering ... 407
7.10.3.4 size filtering ... 407
7.10.3.5 rx packets switching ... 408
7.10.3.5.1 replication mode enabled ... 408
7.10.3.5.2 replication mode disabled ... 410
7.10.3.6 tx packets switching ... 412
7.10.3.6.1 replication mode enabled ... 414
7.10.3.6.2 replication mode disabled ... 415
7.10.3.7 mirroring support ... 416
7.10.3.8 offloads ... 417
7.10.3.8.1 replication by exact mac address ... 417
7.10.3.8.2 replication by promiscuous modes ... 417
7.10.3.8.3 replication by mirroring ... 417
7.10.3.8.4 vlan only filtering ... 418
7.10.3.8.5 local traffic offload ... 418
7.10.3.8.6 small packets padding ... 418
7.10.3.9 security features ... 418
7.10.3.9.1 inbound security ... 418
7.10.3.9.2 outbound security ... 419
7.10.3.9.2.1 anti spoofing ... 419
7.10.3.9.2.2 vlan insertion from register instead of descriptor ... 419
7.10.3.9.2.3 egress vlan filtering ... 419
7.10.3.9.3 interrupt misbehavior of vm ... 419
7.10.3.10 congestion control ... 420
7.10.3.10.1 receive priority ... 420
7.10.3.10.2 queue arbitration and rate control ... 420
7.10.3.10.3 storm control ... 420
7.10.3.10.3.1 assumptions ... 420
7.10.3.10.3.2 storm control functionality ... 421
7.10.3.11 external switch loopback support ... 421
7.10.3.12 switch control ... 422
7.10.4 virtualization of the hardware ... 422
7.10.4.1 per pool statistics ... 422
7.11 time sync (ieee1588 and 802.1as) ... 423
7.11.1 overview ... 423
7.11.2 flow and hardware/software responsibilities ... 423
7.11.2.1 timesync indications in receive and transmit packet descriptors ... 425
7.11.3 hardware time sync elements ... 425
7.11.3.1 system time structure and mode of operation ... 425
7.11.3.2 time stamping mechanism ... 426
7.11.3.3 time adjustment mode of operation ... 427
7.11.4 time sync related auxiliary elements ... 427
7.11.4.1 target time ... 427
7.11.4.2 time stamp events ... 428
7.11.5 ptp packet structure ... 428
7.12 statistics ... 431
7.12.1 ieee 802.3 clause 30 management ... 431
7.12.2 oid_gen_statistics ... 433
7.12.3 rmon ... 433
7.12.4 linux net_device_stats ... 434
7.12.5 macsec statistics ... 435
7.12.6 rx statistics ... 435
7.12.7 statistics hierarchy ... 437
8.0 programming interface ... 441
8.1 introduction ... 441
8.1.1 memory and i/o address decoding ... 441
8.1.1.1 memory-mapped access to internal registers and memories ... 441
8.1.1.2 memory-mapped access to flash ... 442
8.1.1.3 memory-mapped access to msi-x tables ... 442
8.1.1.4 memory-mapped access to expansion rom ... 442
8.1.1.5 i/o-mapped access to internal registers, memories, and flash ... 442
8.1.1.5.1 ioaddr (i/o offset 0x00) ... 442
8.1.1.5.2 iodata (i/o offset 0x04) ... 443
8.1.1.5.3 undefined i/o offsets ... 444
8.1.2 register conventions ... 444
8.1.2.1 registers byte ordering ... 446
8.1.3 register summary ... 447
8.1.4 msi-x bar register summary ... 466
8.2 general register descriptions ... 466
8.2.1 device control register - ctrl (0x00000; r/w) ... 466
8.2.2 device status register - status (0x00008; r) ... 470
8.2.3 extended device control register - ctrl_ext (0x00018; r/w) ... 472
8.2.4 mdi control register - mdic (0x00020; r/w) ... 475
8.2.5 serdes ana - serdesctl (0x00024; r/w) ... 476
8.2.6 copper/fiber switch control - connsw (0x00034; r/w) ... 476
8.2.7 vlan ether type - vet (0x00038; r/w) ... 477
8.2.8 led control - ledctl (0x00e00; rw) ... 477
8.3 packet buffers control register descriptions ... 478
8.3.1 rx pb size - rxpbs (0x2404; rw) ... 478
8.3.2 tx pb size - txpbs (0x3404; rw) ... 479
8.3.3 switch pb size - swpbs (0x3004; rw) ... 479
8.3.4 tx packet buffer wrap around counter - pbtwac (0x34e8; ro) ... 479
8.3.5 rx packet buffer wrap around counter - pbrwac (0x24e8; ro) ... 479
8.3.6 switch packet buffer wrap around counter - pbswac (0x30e8; ro) ... 480
8.4 eeprom/flash register descriptions ... 480
8.4.1 eeprom/flash control register - eec (0x00010; r/w) ... 480
8.4.2 eeprom read register - eerd (0x00014; rw) ... 482
8.4.3 flash access - fla (0x0001c; r/w) ... 482
8.4.4 flash opcode - flashop (0x0103c; r/w) ... 483
8.4.5 eeprom diagnostic - eediag (0x01038; ro) ... 483
8.4.6 eeprom auto read bus control - eearbc (0x01024; r/w) ... 484
8.4.7 vpd diagnostic register - vpddiag (0x1060; ro) ... 485
8.4.8 mng-eeprom csr i/f ... 486
8.4.8.1 mng eeprom control register - eemngctl (0x1010; ro) ... 486
8.4.8.2 mng eeprom read/write data - eemngdata (0x1014; ro) ... 487
8.5 flow control register descriptions ... 487
8.5.1 flow control address low - fcal (0x00028; ro) ... 487
8.5.2 flow control address high - fcah (0x0002c; ro) ... 487
8.5.3 flow control type - fct (0x00030; r/w) ... 487
8.5.4 flow control transmit timer value - fcttv (0x00170; r/w) ... 488
8.5.5 flow control receive threshold low - fcrtl0 (0x02160; r/w) ... 488
8.5.6 flow control receive threshold high - fcrth0 (0x02168; r/w) ... 489
8.5.7 flow control refresh threshold value - fcrtv (0x02460; r/w) ... 489
8.5.8 flow control status - fcsts0 (0x2464; ro) ... 489
8.6 pcie register descriptions ... 490
8.6.1 pcie control - gcr (0x05b00; rw) ... 490
8.6.2 iov control - iovctl (0x05bbc; rw) ... 492
8.6.3 function tag - functag (0x05b08; r/w) ... 492
8.6.4 function active and power state to mng - factps (0x05b30; ro) ... 493
8.6.5 serdes/ccm/pcie csr - gioanactl0 (0x05b34; r/w) ... 494
8.6.6 serdes/ccm/pcie csr - gioanactl1 (0x05b38; r/w) ... 494
8.6.7 serdes/ccm/pcie csr - gioanactl2 (0x05b3c; r/w) ... 494
8.6.8 serdes/ccm/pcie csr - gioanactl3 (0x05b40; r/w) ... 494
8.6.9 serdes/ccm/pcie csr - gioanactlall (0x05b44; r/w) ... 495
8.6.10 serdes/ccm/pcie csr - ccmctl (0x05b48; r/w) ... 495
8.6.11 serdes/ccm/pcie csr - scctl (0x05b4c; r/w) ... 495
8.6.12 mirrored revision id - mrevid (0x05b64; r/w) ... 496
8.7 semaphore registers ... 496
8.7.1 software semaphore - swsm (0x05b50; r/w) ... 496
8.7.2 firmware semaphore - fwsm (0x05b54; r/ws) ... 497
8.7.3 software-firmware synchronization - sw_fw_sync (0x05b5c; rws) ... 498
8.8 interrupt register descriptions ... 499
8.8.1 extended interrupt cause - eicr (0x01580; rc/w1c) ... 499
8.8.2 extended interrupt cause set - eics (0x01520; wo) ... 500
8.8.3 extended interrupt mask set/read - eims (0x01524; rws) ... 501
8.8.4 extended interrupt mask clear - eimc (0x01528; wo) ... 502
8.8.5 extended interrupt auto clear - eiac (0x0152c; r/w) ... 502
8.8.6 extended interrupt auto mask enable - eiam (0x01530; r/w) ... 503
8.8.7 interrupt cause read register - icr (0x01500; rc/w1c) ... 504
8.8.8 interrupt cause set register - ics (0x01504; wo) ... 506
8.8.9 interrupt mask set/read register - ims (0x01508; r/w) ... 507
8.8.10 interrupt mask clear register - imc (0x0150c; wo) ... 508
8.8.11 interrupt acknowledge auto mask register - iam (0x01510; r/w) ... 510
8.8.12 interrupt throttle - eitr (0x01680 + 4*n [n=0...24]; r/w) ... 510
8.8.13 interrupt vector allocation registers - ivar (0x1700 + 4*n [n=0...7]; rw) ... 511
8.8.14 interrupt vector allocation registers - misc ivar_misc (0x1740; rw) ... 512
8.8.15 general purpose interrupt enable - gpie (0x1514; rw) ... 512
8.9 msi-x table register descriptions ... 513
8.9.1 msi-x table entry lower address - msixtadd (bar3: 0x0000 + 0x10*n [n=0...24]; r/w) ... 513
8.9.2 msi-x table entry upper address - msixtuadd (bar3: 0x0004 + 0x10*n [n=0...24]; r/w) ... 514
8.9.3 msi-x table entry message - msixtmsg (bar3: 0x0008 + 0x10*n [n=0...24]; r/w) ... 514
8.9.4 msi-x table entry vector control - msixtvctrl (bar3: 0x000c + 0x10*n [n=0...24]; r/w) ... 514
8.9.5 msixpba bit description - msixpba (bar3: 0x02000; ro) ... 514
8.9.6 msi-x pba clear - pbacl (0x05b68; r/w1c) ... 515
8.10 receive register descriptions ... 515
8.10.1 receive control register - rctl (0x00100; r/w) ... 515
8.10.2 split and replication receive control - srrctl (0x0c00c + 0x40*n [n=0...15]; r/w) ... 518
8.10.3 packet split receive type - psrtype (0x05480 + 4*n [n=0...7]; r/w) ... 519
8.10.4 replicated packet split receive type - rplpsrtype (0x054c0; r/w) ... 520
8.10.5 receive descriptor base address low - rdbal (0x0c000 + 0x40*n [n=0...15]; r/w) ... 521
8.10.6 receive descriptor base address high - rdbah (0x0c004 + 0x40*n [n=0...15]; r/w) ... 521
8.10.7 receive descriptor ring length - rdlen (0x0c008 + 0x40*n [n=0...15]; r/w) ... 521
8.10.8 receive descriptor head - rdh (0x0c010 + 0x40*n [n=0...15]; ro) ... 522
8.10.9 receive descriptor tail - rdt (0x0c018 + 0x40*n [n=0...15]; r/w) ... 522
8.10.10 receive descriptor control - rxdctl (0x0c028 + 0x40*n [n=0...15]; r/w) ... 523
8.10.11 receive queue drop packet count - rqdpc (0xc030 + 0x40*n [n=0...15]; rc) ... 524
8.10.12 dma rx max outstanding data - drxmxod (0x2540; rw) ... 524
8.10.13 receive checksum control - rxcsum (0x05000; r/w) ... 525
8.10.14 receive long packet maximum length - rlpml (0x5004; r/w) ... 526
8.10.15 receive filter control register - rfctl (0x05008; r/w) ... 526
8.10.16 multicast table array - mta (0x05200 + 4*n [n=0...127]; r/w) ... 527
8.10.17 receive address low - ral (0x05400 + 8*n [n=0...15]; 0x054e0 + 8*n [n=0...7]; r/w) ... 528
8.10.18 receive address high - rah (0x05404 + 8*n [n=0...15]; 0x054e4 + 8*n [n=0...7]; r/w) ... 529
8.10.19 vlan filter table array - vfta (0x05600 + 4*n [n=0...127]; r/w) ... 530
8.10.20 multiple receive queues command register - mrqc (0x05818; r/w) ... 531
8.10.21 rss random key register - rssrk (0x05c80 + 4*n [n=0...9]; r/w) ... 532
8.10.22 redirection table - reta (0x05c00 + 4*n [n=0...31]; r/w) ... 533
8.11 filtering register descriptions ... 534
8.11.1 immediate interrupt rx - imir (0x05a80 + 4*n [n=0...7]; r/w) ... 534
8.11.2 immediate interrupt rx ext. - imirext (0x05aa0 + 4*n [n=0...7]; r/w) ... 535
8.11.3 source address queue filter - saqf (0x5980 + 4*n [n=0...7]; rw) ... 535
8.11.4 destination address queue filter - daqf (0x59a0 + 4*n [n=0...7]; rw) ... 536
8.11.5 source port queue filter - spqf (0x59c0 + 4*n [n=0...7]; rw) ... 536
8.11.6 5-tuple queue filter - ftqf (0x59e0 + 4*n [n=0...7]; rw) ... 536
8.11.7 immediate interrupt rx vlan priority - imirvp (0x05ac0; r/w) ... 537
8.11.8 syn packet queue filter - synqf (0x55fc; rw) ... 537
8.11.9 etype queue filter - etqf (0x5cb0 + 4*n [n=0...7]; rw) ... 537
8.12 transmit register descriptions ... 538
8.12.1 transmit control register - tctl (0x00400; r/w) ... 538
8.12.2 transmit control extended - tctl_ext (0x0404; r/w) ... 539
8.12.3 transmit ipg register - tipg (0x0410; r/w) ... 540
8.12.4 dma tx control - dtxctl (0x03590; r/w) ... 541
8.12.5 dma tx tcp flags control low - dtxtcpflgl (0x359c; rw) ... 542
8.12.6 dma tx tcp flags control high - dtxtcpflgh (0x35a0; rw) ... 543
8.12.7 dma tx max total allow size requests - dtxmxszrq (0x3540; rw) ... 543
8.12.8 transmit descriptor base address low - tdbal (0xe000 + 0x40*n [n=0...15]; r/w) ... 543
8.12.9 transmit descriptor base address high - tdbah (0x0e004 + 0x40*n [n=0...15]; r/w) ... 543
8.12.10 transmit descriptor ring length - tdlen (0x0e008 + 0x40*n [n=0...15]; r/w) ... 544
8.12.11 transmit descriptor head - tdh (0x0e010 + 0x40*n [n=0...15]; ro) ... 544
8.12.12 transmit descriptor tail - tdt (0x0e018 + 0x40*n [n=0...15]; r/w) ... 545
8.12.13 transmit descriptor control - txdctl (0x0e028 + 0x40*n [n=0...15]; r/w) ... 545
8.12.14 tx descriptor completion write-back address low - tdwbal (0x0e038 + 0x40*n [n=0...15]; r/w) ... 547
8.12.15 tx descriptor completion write-back address high - tdwbah (0x0e03c + 0x40*n [n=0...15]; r/w) ... 547
intel ? 82576eb gbe controller ? contents intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 26 8.13 dca register descriptions .................................................................................................. ...... 547 8.13.1 rx dca control registers - rxctl (0x0c014 + 0x40*n [n=0...15]; r/w) ................................. 547 8.13.2 tx dca control registers - txctl (0x0e014 + 0x40*n [n=0...15]; r/w).................................. 549 8.13.3 dca requester id information - dca_id (0x05b70; ro) ........................................................ 550 8.13.4 dca control - dca_ctrl (0x05b74; r/w) ............................................................................ 551 8.14 virtualization register descriptions ....................................................................................... ..... 551 8.14.1 next generation vmdq control register ? vt_ctl (0x0581c; r/w) .......................................... 552 8.14.2 physical function mailbox - pfmailbox (0x0c00 + 4*n[n=0...7]; rw) ....................................... 552 8.14.3 virtual function mailbox - vfmailbox (0x0c40 + 4*n [n=0...7]; rw) ........................................ 553 8.14.4 virtualization mailbox memory - vmbmem (0x0800:0x083c + 0x40*n [n=0...7]; r/w) ............... 553 8.14.5 mailbox vf interrupt causes register - mbvficr (0x0c80; r/w1c) ......................................... 554 8.14.6 mailbox vf interrupt mask register - mbvfimr (0x0c84; rw)................................................. 554 8.14.7 flr events - vflre (0x0c88; r/w1c).................................................................................. 554 8.14.8 vf receive enable- vfre (0x0c8c; rw) ............................................................................... 555 8.14.9 vf transmit enable - vfte (0x0c90; rw)............................................................................. 
8.14.10 wrong vm behavior register - wvbr (0x3554; rc)
8.14.11 vm error count mask - vmecm (0x3510; rw)
8.14.12 last vm misbehavior cause - lvmmc (0x3548; rc)
8.14.13 queue drop enable register - qde (0x2408; rw)
8.14.14 dma tx switch control - dtxswc (0x3500; r/w)
8.14.15 vm vlan insert register - vmvir (0x3700 + 4*n [n=0...7]; rw)
8.14.16 vm offload register - vmolr (0x05ad0 + 4*n [n=0...7]; rw)
8.14.17 replication offload register - rplolr (0x05af0; rw)
8.14.18 vlan vm filter - vlvf (0x05d00 + 4*n [n=0...31]; rw)
8.14.19 unicast table array - uta (0xa000 + 4*n [n=0...127]; wo)
8.14.20 storm control control register - sccrl (0x5db0; rw)
8.14.21 storm control status - scsts (0x5db4; ro)
8.14.22 broadcast storm control threshold - bsctrh (0x5db8; rw)
8.14.23 multicast storm control threshold - msctrh (0x5dbc; rw)
8.14.24 broadcast storm control current count - bsccnt (0x5dc0; ro)
8.14.25 multicast storm control current count - msccnt (0x5dc4; ro)
8.14.26 storm control time counter - sctc (0x5dc8; ro)
8.14.27 storm control basic interval - scbi (0x5dcc; rw)
8.14.28 virtual mirror rule control - vmrctl (0x5d80 + 0x4*n [n=0...3]; rw)
8.14.29 virtual mirror rule vlan - vmrvlan (0x5d90 + 0x4*n [n=0...3]; rw)
8.14.30 virtual mirror rule vm - vmrvm (0x5da0 + 0x4*n [n=0...3]; rw)
8.14.31 transmit rate-er config - rc (0x36b0; rw)
8.14.32 transmit rate-er status - (0x36b4; ro)
8.15 tx bandwidth allocation to vm register description
8.15.1 vm bandwidth allocation control & status - vmbacs (0x3600; rw)
8.15.2 vm bandwidth allocation max memory window - vmbammw (0x3670; rw)
8.15.3 vm bandwidth allocation select - vmbasel (0x3604; rw)
8.15.4 vm bandwidth allocation config - vmbac (0x3608; rw)
8.16 timer register descriptions
8.16.1 watchdog setup - wdstp (0x01040; r/w)
8.16.2 watchdog software device status - wdswsts (0x01044; r/w)
8.16.3 free running timer - frtimer (0x01048; rws)
8.16.4 tcp timer - tcptimer (0x0104c; r/w)
8.17 time sync register descriptions
8.17.1 rx time sync control register - tsyncrxctl (0xb620; rw)
8.17.2 rx timestamp low - rxstmpl (0x0b624; ro)
8.17.3 rx timestamp high - rxstmph (0x0b628; ro)
8.17.4 rx timestamp attributes low - rxsatrl (0x0b62c; ro)
8.17.5 rx timestamp attributes high - rxsatrh (0x0b630; ro)
8.17.6 tx time sync control register - tsynctxctl (0x0b614; rw)
8.17.7 tx timestamp value low - txstmpl (0x0b618; ro)
8.17.8 tx timestamp value high - txstmph (0x0b61c; ro)
8.17.9 system time register low - systiml (0x0b600; rws)
8.17.10 system time register high - systimh (0x0b604; rws)
8.17.11 increment attributes register - timinca (0x0b608; rw)
8.17.12 time adjustment offset register low - timadjl (0x0b60c; rw)
8.17.13 time adjustment offset register high - timadjh (0x0b610; rw)
8.17.14 timesync auxiliary control register - tsauxc (0x0b640; rw)
8.17.15 target time register 0 low - trgttiml0 (0x0b644; rw)
8.17.16 target time register 0 high - trgttimh0 (0x0b648; rw)
8.17.17 target time register 1 low - trgttiml1 (0x0b64c; rw)
8.17.18 target time register 1 high - trgttimh1 (0x0b650; rw)
8.17.19 auxiliary time stamp 0 register low - auxstmpl0 (0x0b65c; ro)
8.17.20 auxiliary time stamp 0 register high - auxstmph0 (0x0b660; ro)
8.17.21 auxiliary time stamp 1 register low - auxstmpl1 (0x0b664; ro)
8.17.22 auxiliary time stamp 1 register high - auxstmph1 (0x0b668; ro)
8.17.23 time sync rx configuration - tsyncrxcfg (0x05f50; rw)
8.17.24 time sync sdp config reg - tssdp (0x0003c; rw)
8.18 pcs register descriptions
8.18.1 pcs configuration - pcs_cfg (0x04200; r/w)
8.18.2 pcs link control - pcs_lctl (0x04208; rw)
8.18.3 pcs link status - pcs_lsts (0x0420c; ro)
8.18.4 an advertisement - pcs_anadv (0x04218; r/w)
8.18.5 link partner ability - pcs_lpab (0x0421c; ro)
8.18.6 next page transmit - pcs_nptx (0x04220; rw)
8.18.7 link partner ability next page - pcs_lpabnp (0x04224; ro)
8.18.8 sfp i2c command - i2ccmd (0x01028; r/w)
8.18.9 sfp i2c parameters - i2cparams (0x0102c; r/w)
8.19 statistics register descriptions
8.19.1 crc error count - crcerrs (0x04000; rc)
8.19.2 alignment error count - algnerrc (0x04004; rc)
8.19.3 symbol error count - symerrs (0x04008; rc)
8.19.4 rx error count - rxerrc (0x0400c; rc)
8.19.5 missed packets count - mpc (0x04010; rc)
8.19.6 excessive collisions count - ecol (0x04018; rc)
8.19.7 multiple collision count - mcc (0x0401c; rc)
8.19.8 late collisions count - latecol (0x04020; rc)
8.19.9 collision count - colc (0x04028; rc)
8.19.10 defer count - dc (0x04030; rc)
8.19.11 transmit with no crs - tncrs (0x04034; rc)
8.19.12 host transmit discarded packets by mac count - htdpmc (0x0403c; rc)
8.19.13 receive length error count - rlec (0x04040; rc)
8.19.14 circuit breaker rx dropped packet - cbrdpc (0x04044; rc)
8.19.15 xon received count - xonrxc (0x04048; rc)
8.19.16 xon transmitted count - xontxc (0x0404c; rc)
8.19.17 xoff received count - xoffrxc (0x04050; rc)
8.19.18 xoff transmitted count - xofftxc (0x04054; rc)
8.19.19 fc received unsupported count - fcruc (0x04058; rc)
8.19.20 packets received [64 bytes] count - prc64 (0x0405c; rc)
8.19.21 packets received [65-127 bytes] count - prc127 (0x04060; rc)
8.19.22 packets received [128-255 bytes] count - prc255 (0x04064; rc)
8.19.23 packets received [256-511 bytes] count - prc511 (0x04068; rc)
8.19.24 packets received [512-1023 bytes] count - prc1023 (0x0406c; rc)
8.19.25 packets received [1024 to max bytes] count - prc1522 (0x04070; rc)
8.19.26 good packets received count - gprc (0x04074; rc)
8.19.27 broadcast packets received count - bprc (0x04078; rc)
8.19.28 multicast packets received count - mprc (0x0407c; rc)
8.19.29 good packets transmitted count - gptc (0x04080; rc)
8.19.30 good octets received count - gorcl (0x04088; rc)
8.19.31 good octets received count - gorch (0x0408c; rc)
8.19.32 good octets transmitted count - gotcl (0x04090; rc)
8.19.33 good octets transmitted count - gotch (0x04094; rc)
8.19.34 receive no buffers count - rnbc (0x040a0; rc)
8.19.35 receive undersize count - ruc (0x040a4; rc)
8.19.36 receive fragment count - rfc (0x040a8; rc)
8.19.37 receive oversize count - roc (0x040ac; rc)
8.19.38 receive jabber count - rjc (0x040b0; rc)
8.19.39 management packets received count - mngprc (0x040b4; rc)
8.19.40 bmc management packets received count - bmngprc (0x0413c; rc)
8.19.41 management packets dropped count - mpdc (0x040b8; rc)
8.19.42 bmc management packets dropped count - bmpdc (0x04140; rc)
8.19.43 management packets transmitted count - mngptc (0x040bc; rc)
8.19.44 bmc management packets transmitted count - bmngptc (0x04144; rc)
8.19.45 total octets received - torl (0x040c0; rc)
8.19.46 total octets received - torh (0x040c4; rc)
8.19.47 total octets transmitted - totl (0x040c8; rc)
8.19.48 total octets transmitted - toth (0x040cc; rc)
8.19.49 total packets received - tpr (0x040d0; rc)
8.19.50 total packets transmitted - tpt (0x040d4; rc)
8.19.51 packets transmitted [64 bytes] count - ptc64 (0x040d8; rc)
8.19.52 packets transmitted [65-127 bytes] count - ptc127 (0x040dc; rc)
8.19.53 packets transmitted [128-255 bytes] count - ptc255 (0x040e0; rc)
8.19.54 packets transmitted [256-511 bytes] count - ptc511 (0x040e4; rc)
8.19.55 packets transmitted [512-1023 bytes] count - ptc1023 (0x040e8; rc)
8.19.56 packets transmitted [1024 bytes or greater] count - ptc1522 (0x040ec; rc)
8.19.57 multicast packets transmitted count - mptc (0x040f0; rc)
8.19.58 broadcast packets transmitted count - bptc (0x040f4; rc)
8.19.59 tcp segmentation context transmitted count - tsctc (0x040f8; rc)
8.19.60 circuit breaker rx manageability packet count - cbrmpc (0x040fc; rc)
8.19.61 interrupt assertion count - iac (0x04100; rc)
8.19.62 rx packets to host count - rpthc (0x04104; rc)
8.19.63 debug counter 1 - dbgc1 (0x04108; rc)
8.19.64 debug counter 2 - dbgc2 (0x0410c; rc)
8.19.65 debug counter 3 - dbgc3 (0x04110; rc)
8.19.66 debug counter 4 - dbgc4 (0x0411c; rc)
8.19.67 host good packets transmitted count - hgptc (0x04118; rc)
8.19.68 receive descriptor minimum threshold count - rxdmtc (0x04120; rc)
8.19.69 host tx circuit breaker dropped packets count - htcbdpc (0x04124; rc)
8.19.70 host good octets received count - hgorcl (0x04128; rc)
8.19.71 host good octets received count - hgorch (0x0412c; rc)
8.19.72 host good octets transmitted count - hgotcl (0x04130; rc)
8.19.73 host good octets transmitted count - hgotch (0x04134; rc)
8.19.74 length error count - lenerrs (0x04138; rc)
8.19.75 serdes/sgmii code violation packet count - scvpc (0x04228; rw)
8.19.76 switch security violation packet count - ssvpc (0x41a0; rc)
8.19.77 switch drop packet count - sdpc (0x41a4; rc)
8.20 wake up control register descriptions
8.20.1 wakeup control register - wuc (0x05800; r/w)
8.20.2 wakeup filter control register - wufc (0x05808; r/w)
8.20.3 wakeup status register - wus (0x05810; r/w1c)
8.20.4 wakeup packet length - wupl (0x05900; ro)
8.20.5 wakeup packet memory - wupm (0x05a00 + 4*n [n=0...31]; ro)
8.20.6 ip address valid - ipav (0x5838; r/w)
8.20.7 ipv4 address table - ip4at (0x05840 + 8*n [n=0...3]; r/w)
8.20.8 ipv6 address table - ip6at (0x05880 + 4*n [n=0...3]; r/w)
8.20.9 flexible host filter table registers - fhft (0x09000 - 0x093fc; rw)
8.20.10 flexible host filter table extended registers - fhft_ext (0x09a00 - 0x09bfc; rw)
8.21 management register descriptions
8.21.1 management vlan tag value - mavtv (0x5010 + 4*n [n=0...7]; rw)
8.21.2 management flex udp/tcp ports - mfutp (0x5030 + 4*n [n=0...7]; rw)
8.21.3 management ethernet type filters - metf (0x5060 + 4*n [n=0...3]; rw)
8.21.4 management control register - manc (0x05820; rw)
8.21.5 manageability filters valid - mfval (0x5824; rw)
8.21.6 management control to host register - manc2h (0x5860; rw)
8.21.7 manageability decision filters - mdef (0x5890 + 4*n [n=0...7]; rw)
8.21.8 manageability decision filters - mdef_ext (0x5930 + 4*n [n=0...7]; rw)
8.21.9 manageability ip address filter - mipaf (0x58b0 + 4*n [n=0...15]; rw)
8.21.10 manageability mac address low - mmal (0x5910 + 8*n [n=0...3]; rw)
8.21.11 manageability mac address high - mmah (0x5914 + 8*n [n=0...3]; rw)
8.21.12 flexible tco filter table registers - ftft (0x09400 - 0x097fc; rw)
8.22 macsec register descriptions
8.22.1 macsec tx capabilities register - lsectxcap (0xb000; ro)
8.22.2 macsec rx capabilities register - lsecrxcap (0xb300; ro)
8.22.3 macsec tx control register - lsectxctrl (0xb004; rw)
8.22.4 macsec rx control register - lsecrxctrl (0xb304; rw)
8.22.5 macsec tx sci low - lsectxscl (0xb008; rw)
8.22.6 macsec tx sci high - lsectxsch (0xb00c; rw)
8.22.7 macsec tx sa - lsectxsa (0xb010; rw)
8.22.8 macsec tx sa pn 0 - lsectxpn0 (0xb018; rw)
8.22.9 macsec tx sa pn 1 - lsectxpn1 (0xb01c; rw)
8.22.10 macsec tx key 0 - lsectxkey0 (0xb020 + 4*n [n=0...3]; wo)
8.22.11 macsec tx key 1 - lsectxkey1 (0xb030 + 4*n [n=0...3]; wo)
8.22.12 macsec rx sci low - lsecrxscl (0xb3d0; rw)
8.22.13 macsec rx sci high - lsecrxsch (0xb3e0; rw)
8.22.14 macsec rx sa - lsecrxsa[n] (0xb310 + 4*n [n=0...1]; rw)
8.22.15 macsec rx sa pn - lsecrxsapn (0xb330 + 4*n [n=0...1]; rw)
8.22.16 macsec rx key - lsecrxkey (0xb350 + 16*n [n=0...1] + 4*m (m=0...3); wo)
8.22.17 macsec software/firmware interface - lswfw (0x8f14; ro)
8.22.18 macsec tx port statistics
8.22.18.1 tx untagged packet counter - lsectxut (0x4300; rc)
8.22.18.2 encrypted tx packets count - lsectxpkte (0x4304; rc)
8.22.18.3 protected tx packets count - lsectxpktp (0x4308; rc)
8.22.18.4 encrypted tx octets count - lsectxocte (0x430c; rc)
8.22.18.5 protected tx octets count - lsectxoctp (0x4310; rc)
8.22.19 macsec rx port statistic
8.22.19.1 macsec untagged rx packet count - lsecrxut (0x4314; rc)
8.22.19.2 macsec rx octets decrypted count - lsecrxocte (0x431c; rc)
8.22.19.3 macsec rx octets validated count - lsecrxoctp (0x4320; rc)
8.22.19.4 macsec rx packet with bad tag count - lsecrxbad (0x4324; rc)
8.22.19.5 macsec rx packet no sci count - lsecrxnosci (0x4328; rc)
8.22.19.6 macsec rx packet unknown sci count - lsecrxunsci (0x432c; rc)
8.22.20 macsec rx sc statistic register descriptions
8.22.20.1 macsec rx unchecked packets count - lsecrxunch (0x4330; rc)
8.22.20.2 macsec rx delayed packets count - lsecrxdelay (0x4340; rc)
8.22.20.3 macsec rx late packets count - lsecrxlate (0x4350; rc)
8.22.21 macsec rx sa statistic register descriptions
8.22.21.1 macsec rx packet ok count - lsecrxok[n] (0x4360 + 4*n [n=0...1]; rc)
8.22.21.2 macsec rx invalid count - lsecrxinv[n] (0x4380 + 4*n [n=0...1]; rc)
8.22.21.3 macsec rx not valid count - lsecrxnv[n] (0x43a0 + 4*n [n=0...1]; rc)
8.22.21.4 macsec rx not using sa count - lsecrxnusa (0x43c0; rc)
8.22.21.5 macsec rx unused sa count - lsecrxunsa (0x43d0; rc)
8.23 ipsec registers description
8.23.1 ipsec control - ipsctrl (0xb430; rw)
8.23.2 ipsec tx index - ipstxidx (0xb450; rw)
8.23.3 ipsec tx key registers - ipstxkey (0xb460 + 4*n [n=0...3]; rw)
8.23.4 ipsec tx salt register - ipstxsalt (0xb454; rw)
8.23.5 ipsec rx command register - ipsrxcmd (0xb408; rw)
8.23.6 ipsec rx spi register - ipsrxspi (0xb40c; rw)
8.23.7 ipsec rx key register - ipsrxkey (0xb410 + 4*n [n=0...3]; rw)
8.23.8 ipsec rx salt register - ipsrxsalt (0xb404; rw)
8.23.9 ipsec rx ip address register - ipsrxipaddr (0xb420 + 4*n [n=0...3]; rw)
8.23.10 ipsec rx index - ipsrxidx (0xb400; rw)
8.24 diagnostic registers description
8.24.1 receive data fifo head register - rdfh (0x02410; rws)
8.24.2 receive data fifo tail register - rdft (0x02418; rws)
8.24.3 receive data fifo head saved register - rdfhs (0x2420; rws)
8.24.4 receive data fifo tail saved register - rdfts (0x02428; rws)
8.24.5 switch buffer fifo head register - swbfh (0x03010; rws)
8.24.6 switch buffer fifo tail register - swbft (0x03018; rws)
8.24.7 switch buffers fifo head saved register - swbfhs (0x03020; rws)
8.24.8 switch buffers fifo tail saved register - swbfts (0x03028; rws)
8.24.9 packet buffer diagnostic - pbdiag (0x02458; r/w)
8.24.10 transmit data fifo head register - tdfh (0x03410; rws)
8.24.11 transmit data fifo tail register - tdft (0x03418; rws)
8.24.12 transmit data fifo head saved register - tdfhs (0x03420; rws)
8.24.13 transmit data fifo tail saved register - tdfts (0x03428; rws)
8.24.14 transmit data fifo packet count - tdfpc (0x03430; ro)
8.24.15 receive data fifo packet count - rdfpc (0x02430; ro)
8.24.16 switch data fifo packet count - swdfpc (0x03030; ro)
8.24.17 ipsec packet buffer ecc status - ippbeccsts (0xb470; rc)
8.24.18 pb slave access control - pbslac (0x3100; rw)
8.24.19 pb slave access data - pbslad (0x3110 + 4*n [n=0...3]; rw)
8.24.20 rx descriptor handler memory - rdhm (0x06000 + 4*n [n=0...1023]; ro)
8.24.21 rx descriptor handler memory page number - rdhmp (0x025fc; rw)
8.24.22 tx descriptor handler memory - tdhm (0x07000 + 4*n [n=0...1023]; ro)
8.24.23 tx descriptor handler memory page number - tdhmp (0x035fc; r/w)
8.24.24 rx packet buffer ecc status - rpbeccsts (0x0245c; rc)
8.24.25 tx packet buffer ecc status - tpbeccsts (0x0345c; rc)
8.24.26 switch packet buffer ecc status - swpbeccsts (0x0305c; rc)
8.24.27 ipsec packet buffer ecc error inject - ippbeei (0xb474; rw)
8.24.28 rx descriptor handler ecc status - rdhests (0x025c0; rc)
8.24.29 tx descriptor handler ecc status - tdhests (0x35c0; rc)
8.24.30 pcie retry buffer ecc status - prbests (0x05ba0; rc)
8.24.31 pcie write buffer ecc status - pwbests (0x05bb0; rc)
8.24.32 pcie msi-x ecc status - pmsixests (0x05ba8; rc)
8.24.33 parity and ecc error indication - peind (0x1084; rc)
8.24.34 parity and ecc indication mask - peindm (0x1088; rw)
8.24.35 tx dma performance burst and descriptor count - txbdc (0x35e0; rc)
8.24.36 tx dma performance idle count - txidle (0x35e4; rc)
8.24.37 rx dma performance burst and descriptor count - rxbdc (0x25e0; rc)
8.24.38 rx dma performance idle count - rxidle (0x25e4; rc)
8.25 phy software interface (phyreg)
8.25.1 phy control register - pctrl (00d; r/w)
8.25.2 phy status register - pstatus (01d; r)
8.25.3 phy identifier register 1 (lsb) - phy id 1 (02d; r)
8.25.4 phy identifier register 2 (msb) - phy id 2 (03d; r)
8.25.5 auto-negotiation advertisement register - ana (04d; r/w)
8.25.6 auto-negotiation base page ability register - (05d; r)
8.25.7 auto-negotiation expansion register - ane (06d; r)
8.25.8 auto-negotiation next page transmit register - npt (07d; r/w)
8.25.9 auto-negotiation next page ability register - lpn (08d; r)
8.25.10 1000base-t/100base-t2 control register - gcon (09d; r/w)
8.25.11 1000base-t/100base-t2 status register - gstatus (10d; r)
8.25.12 extended status register - estatus (15d; r)
8.25.13 port configuration register - pconf (16d; r/w)
8.25.14 port status 1 register - pstat (17d; ro)
8.25.15 port control register - pcont (18d; r/w)
8.25.16 link health register - link (19d; ro)
8.25.17 1000base-t fifo register - pfifo (20d; r/w)
8.25.18 channel quality register - chan (21d; ro)
8.25.19 phy power management - (25d; r/w)
8.25.20 special gigabit disable register - (26d; r/w)
8.25.21 misc. control register 1 - (27d; r/w)
8.25.22 misc. control register 2 - (28d; ro)
8.25.23 page select core register - (31d; wo)
8.26 virtual function device registers
8.26.1 queues registers
8.26.2 non-queue registers
8.26.2.1 eitr registers
8.26.2.2 msi-x registers
8.26.3 register set - csr bar
8.26.4 register set - msi-x bar
8.27 virtual function register descriptions
8.27.1 vt control register - vtctrl (0x0000; rw)
8.27.2 vf status register - status (0x00008; ro)
8.27.3 vt free running timer - vtfrtimer (0x01048; ro)
8.27.4 vt extended interrupt cause - vteicr (0x01580; rc/w1c)
8.27.5 vt extended interrupt cause set - vteics (0x01520; wo)
8.27.6 vt extended interrupt mask set/read - vteims (0x01524; rws)
8.27.7 vt extended interrupt mask clear - vteimc (0x01528; wo)
8.27.8 vt extended interrupt auto clear - vteiac (0x0152c; r/w)
8.27.9 vt extended interrupt auto mask enable - vteiam (0x01530; r/w)
8.27.10 vt interrupt throttle - vteitr (0x01680 + 4*n [n=0...2]; r/w)
8.27.11 vt interrupt vector allocation registers - vtivar (0x01700; rw)
8.27.12 vt interrupt vector allocation registers - vtivar_misc (0x01740; rw)
8.27.13 msi-x table entry lower address - msixtadd (bar3: 0x0000 + 16*n [n=0...2]; r/w)
8.27.14 msi-x table entry upper address - msixtuadd (bar3: 0x0004 + 16*n [n=0...2]; r/w)
8.27.15 msi-x table entry message - msixtmsg (bar3: 0x0008 + 16*n [n=0...2]; r/w)
8.27.16 msi-x table entry vector control - msixtvctrl (bar3: 0x000c + 16*n [n=0...2]; r/w)
8.27.17 msixpba - msixpba (bar3: 0x02000; ro)
8.27.18 msi-x pba clear - pbacl (0x00f04; r/w1c)
8.27.19 receive descriptor base address low - rdbal (0x02800 + 256*n [n=0...1]; r/w)
8.27.20 receive descriptor base address high - rdbah (0x02804 + 256*n [n=0...1]; r/w)
8.27.21 receive descriptor ring length - rdlen (0x02808 + 256*n [n=0...1]; r/w)
8.27.22 receive descriptor head - rdh (0x02810 + 256*n [n=0...1]; r/0)
8.27.23 receive descriptor tail - rdt (0x02818 + 256*n [n=0...1]; r/w)
8.27.24 receive descriptor control - rxdctl (0x02828 + 256*n [n=0...1]; r/w)
8.27.25 split and replication receive control register queue - srrctl (0x0280c + 256*n [n=0...1]; r/w)
8.27.26 receive queue drop packet count - rqdpc (0x2830 + 256*n [n=0...1]; rc)
8.27.27 replication packet split receive type - psrtype (0x00f0c; r/w)
8.27.28 transmit descriptor base address low - tdbal (0x3800 + 256*n [n=0...1]; r/w)
8.27.29 transmit descriptor base address high - tdbah (0x03804 + 256*n [n=0...1]; r/w)
8.27.30 transmit descriptor ring length - tdlen (0x03808 + 256*n [n=0...1]; r/w)
8.27.31 transmit descriptor head - tdh (0x03810 + 256*n [n=0...1]; r/0)
8.27.32 transmit descriptor tail - tdt (0x03818 + 256*n [n=0...1]; r/w)
8.27.33 transmit descriptor control - txdctl (0x03828 + 256*n [n=0...1]; r/w)
8.27.34 tx descriptor completion write-back address low - tdwbal (0x03838 + 256*n [n=0...1]; r/w)
8.27.35 tx descriptor completion write-back address high - tdwbah (0x0383c + 256*n [n=0...1]; r/w)
8.27.36 rx dca control registers - rxctl (0x02814 + 256*n [n=0...1]; r/w)
8.27.37 tx dca control registers - txctl (0x03814 + 256*n [n=0...1]; r/w)
8.27.38 good packets received count - vfgprc (0x0f10; ro)
8.27.39 good packets transmitted count - vfgptc (0x0f14; ro)
8.27.40 good octets received count - vfgorc (0x0f18; ro)
8.27.41 good octets transmitted count - vfgotc (0x0f34; ro)
8.27.42 multicast packets received count - vfmprc (0x0f3c; ro)
8.27.43 good tx octets loopback count - vfgotlbc (0x0f50; ro)
8.27.44 good tx packets loopback count - vfgptlbc (0x0f44; ro)
8.27.45 good rx octets loopback count - vfgorlbc (0x0f48; ro)
8.27.46 good rx packets loopback count - vfgprlbc (0x0f40; ro)
8.27.47 virtual function mailbox - vfmailbox (0x0c40; rw)
8.27.48 virtualization mailbox memory - vmbmem (0x0800:0x083c; r/w)
8.27.49 tx packet buffer wrap around counter - pbtwac (0x34e8; ro)
8.27.50 rx packet buffer wrap around counter - pbrwac (0x24e8; ro)
8.27.51 switch packet buffer wrap around counter - pbswac (0x30e8; ro)
9.0 pcie programming interface
9.1 pcie compatibility
9.2 configuration sharing among pci functions
9.3 register map
9.3.1 register attributes
9.3.2 pcie configuration space summary
9.4 mandatory pci configuration registers
9.4.1 vendor id register (0x0; ro)
9.4.2 device id register (0x2; ro)
9.4.3 command register (0x4; r/w)
9.4.4 status register (0x6; ro)
9.4.5 revision register (0x8; ro)
9.4.6 class code register (0x9; ro)
9.4.7 cache line size register (0xc; r/w)
9.4.8 latency timer register (0xd; ro)
9.4.9 header type register (0xe; ro)
9.4.10 bist register (0xf; ro)
9.4.11 base address registers (0x10:0x27; r/w)
9.4.11.1 32-bit mapping
9.4.11.2 64-bit mapping without i/o bar
9.4.11.3 64-bit mapping without flash bar
9.4.12 cardbus cis register (0x28; ro)
9.4.13 subsystem vendor id register (0x2c; ro)
9.4.14 subsystem id register (0x2e; ro)
9.4.15 expansion rom base address register (0x30; ro)
9.4.16 cap_ptr register (0x34; ro)
9.4.17 interrupt line register (0x3c; rw)
9.4.18 interrupt pin register (0x3d; ro)
9.4.19 max_lat/min_gnt (0x3e; ro)
9.5 pci capabilities
9.5.1 pci power management registers
9.5.1.1 capability id register (0x40; ro)
9.5.1.2 next pointer (0x41; ro)
9.5.1.3 power management capabilities - pmc (0x42; ro)
9.5.1.4 power management control / status register - pmcsr (0x44; r/w)
9.5.1.5 bridge support extensions - pmcsr_bse (0x46; ro)
9.5.1.6 data register (0x47; ro)
9.5.2 msi configuration
9.5.2.1 capability id register (0x50; ro)
9.5.2.2 next pointer register (0x51; ro)
9.5.2.3 message control register (0x52; r/w)
9.5.2.4 message address low register (0x54; r/w)
9.5.2.5 message address high register (0x58; r/w)
9.5.2.6 message data register (0x5c; r/w)
9.5.2.7 mask bits register (0x60; r/w)
9.5.2.8 pending bits register (0x64; r/w)
9.5.3 msi-x configuration
9.5.3.1 capability id register (0x70; ro)
9.5.3.2 next pointer register (0x71; ro)
9.5.3.3 message control register (0x72; r/w)
9.5.3.4 table offset register (0x74; r/w)
9.5.3.5 pba offset register (0x78; r/w)
9.5.4 vital product data registers
9.5.4.1 capability id register (0xe0; ro)
9.5.4.2 next pointer register (0xe1; ro)
9.5.4.3 vpd address register (0xe2; rw)
9.5.4.4 vpd data register (0xe4; rw)
9.5.5 pcie configuration registers
9.5.5.1 capability id register (0xa0; ro)
9.5.5.2 next pointer register (0xa1; ro)
9.5.5.3 pcie cap register (0xa2; ro)
9.5.5.4 device capability register (0xa4; rw)
9.5.5.5 device control register (0xa8; rw)
9.5.5.6 device status register (0xaa; rw1c)
9.5.5.7 link cap register (0xac; ro)
9.5.5.8 link control register (0xb0; ro)
9.5.5.9 link status register (0xb2; ro)
9.5.5.10 reserved registers (0xb4-0xc0; ro)
9.5.5.11 device cap 2 register (0xc4; ro)
9.5.5.12 device control 2 register (0xc8; rw)
9.6 pcie extended configuration space
9.6.1 advanced error reporting (aer) capability
9.6.1.1 pcie cap id register (0x100; ro)
9.6.1.2 uncorrectable error status register (0x104; r/w1cs)
9.6.1.3 uncorrectable error mask register (0x108; rws)
9.6.1.4 uncorrectable error severity register (0x10c; rws)
9.6.1.5 correctable error status register (0x110; r/w1cs)
9.6.1.6 correctable error mask register (0x114; rws)
9.6.1.7 advanced error capabilities and control register (0x118; ro)
9.6.1.8 header log register (0x11c:0x128; ro)
9.6.2 serial number
9.6.2.1 device serial number enhanced capability header register (0x140; ro)
9.6.2.2 serial number register (0x144:0x148; ro)
9.6.3 ari capability structure
9.6.3.1 pcie ari header register (0x150; ro)
9.6.3.2 pcie ari capabilities & control register (0x154; ro)
9.6.4 iov capability structure
9.6.4.1 pcie sr-iov header register (0x160; ro)
9.6.4.2 pcie sr-iov capabilities register (0x164; ro)
9.6.4.3 pcie sr-iov control register (0x168; rw)
9.6.4.4 pcie sr-iov max/total vfs register (0x16c)
9.6.4.5 pcie sr-iov num vfs register (0x170; r/w)
9.6.4.6 pcie sr-iov vf rid mapping register (0x174; ro)
9.6.4.7 pcie sr-iov vf device id register (0x178; ro)
9.6.4.8 pcie sr-iov supported page size register (0x17c; ro)
9.6.4.9 pcie sr-iov system page size register (0x180; r/w)
9.6.4.10 pcie sr-iov bar 0 - low register (0x184; r/w)
9.6.4.11 pcie sr-iov bar 0 - high register (0x188; r/w)
9.6.4.12 pcie sr-iov bar 2 register (0x18c; ro)
9.6.4.13 pcie sr-iov bar 3 - low register (0x190; r/w)
9.6.4.14 pcie sr-iov bar 3 - high register (0x194; r/w)
9.6.4.15 pcie sr-iov bar 5 register (0x198; ro)
9.6.4.16 pcie sr-iov vf migration state array offset register (0x19c; ro)
9.7 virtual functions (vf) configuration space
9.7.1 legacy header details
9.7.1.1 vf command register (0x4; rw)
9.7.1.2 vf status register (0x6; rw)
9.7.2 vf legacy capabilities
9.7.2.1 vf msi-x capability
9.7.2.1.1 vf msi-x control register (0x72; rw)
9.7.2.2 vf pcie capability registers
9.7.2.2.1 vf device control register (0xa8; rw)
9.7.2.2.2 vf device status register (0xaa; rw1c)
9.7.2.3 vf advanced error reporting registers
9.7.2.3.1 vf uncorrectable error status register (0x104; r/w1cs)
9.7.2.3.2 vf correctable error status register (0x110; r/w1cs)
10.0 system manageability
10.1 pass-through (pt) functionality
10.2 sideband packet routing
10.3 components of the sideband interface
10.3.1 physical layer
10.3.1.1 smbus
10.3.1.2 nc-si
10.3.2 logical layer
10.3.2.1 smbus
10.3.2.2 nc-si
10.4 packet filtering
10.4.1 manageability receive filtering
10.4.2 ethertype filters
10.4.3 l2 layer filtering
10.4.4 l3/l4 filtering
10.4.4.1 arp filtering
10.4.4.2 neighbor discovery filtering
10.4.4.3 rmcp filtering
10.4.4.4 flexible port filtering
10.4.4.5 flexible 128 byte filter
10.4.4.5.1 flexible filter structure
10.4.4.5.2 tco filter programming
10.4.4.6 ip address filtering
10.4.4.7 checksum filtering
10.4.5 configuring manageability filters
10.4.5.1 manageability decision filters (mdef) and extended manageability decision filters (mdef_ext)
10.4.5.2 management to host filter
10.4.6 possible configurations
10.4.6.1 dedicated mac packet filtering
10.4.6.2 broadcast packet filtering
10.4.6.3 vlan packet filtering
10.4.6.4 receive filtering with shared ip
10.4.7 determining manageability mac address
10.5 smbus pass-through interface
10.5.1 general
10.5.2 pass-through capabilities
10.5.3 pass-through multi-port modes
10.5.4 automatic ethernet arp operation
10.5.4.1 arp packet formats
10.5.5 smbus transactions
10.5.5.1 smbus addressing
10.5.5.2 smbus arp functionality
10.5.5.3 smbus arp flow
10.5.5.4 smbus arp udid content
10.5.5.5 smbus arp in dual/single mode
10.5.5.6 concurrent smbus transactions
10.5.6 smbus notification methods
10.5.6.1 smbus alert and alert response method
10.5.6.2 asynchronous notify method
10.5.6.3 direct receive method
10.5.7 receive tco flow
10.5.8 transmit tco flow
10.5.8.1 transmit errors in sequence handling
10.5.8.2 tco command aborted flow
10.5.9 smbus arp transactions
10.5.9.1 prepare to arp
10.5.9.2 reset device (general)
10.5.9.3 reset device (directed)
10.5.9.4 assign address
10.5.9.5 get udid (general and directed)
10.5.10 smbus pass-through transactions
10.5.10.1 write smbus transactions
10.5.10.1.1 transmit packet command
10.5.10.1.2 request status command
10.5.10.1.3 receive enable command
10.5.10.1.3.1 management mac address (data bytes 7:2)
10.5.10.1.3.2 management ip address (data bytes 11:8)
763 10.5.10.1.3.3 asynchronous notification smbus address (data byte 12) ........................................763 10.5.10.1.3.4 interface data (data byte 13) .............................................................................. 763 10.5.10.1.3.5 alert value data (data byte 14) ........................................................................... 764 10.5.10.1.4 force tco command................................................................................................. 76 4 10.5.10.1.5 management control ................................................................................................. 7 64 10.5.10.1.5.1 update management receive filter parameters.......................................................765 10.5.10.1.6 update macsec parameters ....................................................................................... 767 10.5.10.2 read smbus transactions ............................................................................................. 76 9 10.5.10.2.1 receive tco lan packet transaction ........................................................................... 770 10.5.10.2.1.1 receive tco lan status payload transaction ......................................................... 771 10.5.10.2.2 read status command .............................................................................................. 773 10.5.10.2.3 get system mac address ........................................................................................... 775 10.5.10.2.4 read management parameters.................................................................................... 776 10.5.10.2.5 read management receive filter parameters ................................................................ 777 10.5.10.2.6 read receive enable configuration.............................................................................. 
779 10.5.10.2.7 read macsec parameters .......................................................................................... 779 10.5.11 lan fail-over in lan teaming mode .....................................................................................78 2 10.5.11.1 fail-over functionality ............................................................................................... ... 782 10.5.11.1.1 transmit functionality .............................................................................................. . 782 10.5.11.1.2 receive functionality............................................................................................... .. 782 10.5.11.1.3 port switching (fail-over) .......................................................................................... 783 10.5.11.1.4 device driver interactions .......................................................................................... 783 10.5.11.2 fail-over configuration ............................................................................................... .. 783 10.5.11.2.1 preferred primary port .............................................................................................. . 783 10.5.11.2.2 gratuitous arps..................................................................................................... ... 783 10.5.11.2.3 link down timeout ................................................................................................... 784 10.5.11.3 fail-over register .................................................................................................... .... 784 10.5.12 example configuration steps ............................................................................................. ..785 10.5.12.1 example 1 - shared mac, rmcp only ports ...................................................................... 
785 10.5.12.1.1 example 1 pseudo code............................................................................................. 78 5 10.5.12.2 example 2 - dedicated mac, auto arp response and rmcp port filtering ................................................................................................. 786 10.5.12.2.1 example 2 - pseudo code........................................................................................... 78 6 10.5.12.3 example 3 - dedicated mac & ip address ........................................................................ 788 10.5.12.3.1 example 3 - pseudo code........................................................................................... 78 9 10.5.12.4 example 4 - dedicated mac and vlan tag ..................................................................... 791 10.5.12.4.1 example 4 - pseudo code........................................................................................... 79 1 10.5.13 smbus troubleshooting ................................................................................................... ....793 10.5.13.1 tco alert line stays asserted after a power cycle ........................................................... 793 10.5.13.2 when smbus commands are always nack'd ................................................................... 793 10.5.13.3 smbus clock speed is 16.6666 khz ............................................................................... 794 10.5.13.4 a network based host application is not receiving any network packets .................................................................................................... 794 10.5.13.5 unable to transmit packets from the mc ......................................................................... 794 10.5.13.6 smbus fragment size ................................................................................................... 
794 10.5.13.7 losing link........................................................................................................... ....... 795 10.5.13.8 enable xsum filtering ................................................................................................. . 796
10.5.13.9 still having problems? ... 796
10.6 nc-si pass through interface ... 796
10.6.1 overview ... 796
10.6.1.1 terminology ... 796
10.6.1.2 system topology ... 797
10.6.1.3 data transport ... 799
10.6.1.3.1 control frames ... 799
10.6.1.3.2 nc-si frames receive flow ... 799
10.6.2 nc-si support ... 800
10.6.2.1 supported features ... 800
10.6.2.2 nc-si mode - intel specific commands ... 802
10.6.2.2.1 overview ... 802
10.6.2.2.2 oem command (0x50) ... 803
10.6.2.2.3 oem response (0xd0) ... 803
10.6.2.2.4 oem specific command response reason codes ... 803
10.6.2.3 proprietary commands format ... 805
10.6.2.3.1 set intel filters control command (intel command 0x00) ... 805
10.6.2.3.2 set intel filters control response format (intel command 0x00) ... 806
10.6.2.4 set intel filters control - ip filters control command (intel command 0x00, filter control index 0x00) ... 806
10.6.2.4.1 set intel filters control - ip filters control response (intel command 0x00, filter control index 0x00) ... 807
10.6.2.5 get intel filters control commands (intel command 0x01) ... 807
10.6.2.5.1 get intel filters control - ip filters control command (intel command 0x01, filter control index 0x00) ... 807
10.6.2.5.2 get intel filters control - ip filters control response (intel command 0x01, filter control index 0x00) ... 808
10.6.2.6 set intel filters formats ... 808
10.6.2.6.1 set intel filters command (intel command 0x02) ... 808
10.6.2.6.2 set intel filters response (intel command 0x02) ... 808
10.6.2.6.3 set intel filters - manageability to host command (intel command 0x02, filter parameter 0x0a) ... 809
10.6.2.6.4 set intel filters - manageability to host response (intel command 0x02, filter parameter 0x0a) ... 809
10.6.2.6.5 set intel filters - flex filter 0 enable mask and length command (intel command 0x02, filter parameter 0x10/0x20/0x30/0x40) ... 810
10.6.2.6.6 set intel filters - flex filter 0 enable mask and length response (intel command 0x02, filter parameter 0x10/0x20/0x30/0x40) ... 810
10.6.2.6.7 set intel filters - flex filter 0 data command (intel command 0x02, filter parameter 0x11/0x21/0x31/0x41) ... 810
10.6.2.6.8 set intel filters - flex filter 0 data response (intel command 0x02, filter parameter 0x11/0x21/0x31/0x41) ... 811
10.6.2.6.9 set intel filters - packet addition decision filter command (intel command 0x02, filter parameter 0x61) ... 811
10.6.2.6.10 set intel filters - packet addition decision filter response (intel command 0x02, filter parameter 0x61) ... 813
10.6.2.6.11 set intel filters - flex tcp/udp port filter command (intel command 0x02, filter parameter 0x63) ... 813
10.6.2.6.12 set intel filters - flex tcp/udp port filter response (intel command 0x02, filter parameter 0x63) ... 814
10.6.2.6.13 set intel filters - ipv4 filter command (intel command 0x02, filter parameter 0x64) ... 814
10.6.2.6.14 set intel filters - ipv4 filter response (intel command 0x02, filter parameter 0x64) ... 814
10.6.2.6.15 set intel filters - ipv6 filter command (intel command 0x02, filter parameter 0x65) ... 815
10.6.2.6.16 set intel filters - ipv6 filter response (intel command 0x02, filter parameter 0x65) ... 815
10.6.2.6.17 set intel filters - ethertype filter command (intel command 0x02, filter parameter 0x67) ... 815
10.6.2.6.18 set intel filters - ethertype filter response (intel command 0x02, filter parameter 0x67) ... 816
10.6.2.6.19 set intel filters - packet addition extended decision filter command (intel command 0x02, filter parameter 0x68) ... 816
10.6.2.6.20 set intel filters - packet addition extended decision filter response (intel command 0x02, filter parameter 0x68) ... 818
10.6.2.7 get intel filters formats ... 819
10.6.2.7.1 get intel filters command (intel command 0x03) ... 819
10.6.2.7.2 get intel filters response (intel command 0x03) ... 819
10.6.2.7.3 get intel filters - manageability to host command (intel command 0x03, filter parameter 0x0a) ... 819
10.6.2.7.4 get intel filters - manageability to host response (intel command 0x03, filter parameter 0x0a) ... 819
10.6.2.7.5 get intel filters - flex filter 0 enable mask and length command (intel command 0x03, filter parameter 0x10/0x20/0x30/0x40) ... 820
10.6.2.7.6 get intel filters - flex filter 0 enable mask and length response (intel command 0x03, filter parameter 0x10/0x20/0x30/0x40) ... 821
10.6.2.7.7 get intel filters - flex filter 0 data command (intel command 0x03, filter parameter 0x11/0x21/0x31/0x41) ... 821
10.6.2.7.8 get intel filters - flex filter 0 data response (intel command 0x03, filter parameter 0x11) ... 821
10.6.2.7.9 get intel filters - packet addition decision filter command (intel command 0x03, filter parameter 0x61) ... 822
10.6.2.7.10 get intel filters - packet addition decision filter response (intel command 0x03, filter parameter 0x0a) ... 822
10.6.2.7.11 get intel filters - flex tcp/udp port filter command (intel command 0x03, filter parameter 0x63) ... 822
10.6.2.7.12 get intel filters - flex tcp/udp port filter response (intel command 0x03, filter parameter 0x63) ... 823
10.6.2.7.13 get intel filters - ipv4 filter command (intel command 0x03, filter parameter 0x64) ... 823
10.6.2.7.14 get intel filters - ipv4 filter response (intel command 0x03, filter parameter 0x64) ... 823
10.6.2.7.15 get intel filters - ipv6 filter command (intel command 0x03, filter parameter 0x65) ... 824
10.6.2.7.16 get intel filters - ipv6 filter response (intel command 0x03, filter parameter 0x65) ... 824
10.6.2.8 set intel packet reduction filters formats ... 825
10.6.2.8.1 set intel packet reduction filters command (intel command 0x04) ... 825
10.6.2.8.2 set intel packet reduction filters response (intel command 0x04) ... 825
10.6.2.8.3 set unicast packet reduction command (intel command 0x04, reduction filter index 0x00) ... 825
10.6.2.8.4 set unicast packet reduction response (intel command 0x04, reduction filter index 0x00) ... 827
10.6.2.8.5 set multicast packet reduction command (intel command 0x04, reduction filter index 0x01) ... 827
10.6.2.8.6 set multicast packet reduction response (intel command 0x04, reduction filter index 0x01) ... 829
10.6.2.8.7 set broadcast packet reduction command (intel command 0x04, reduction filter index 0x02) ... 829
10.6.2.8.8 set broadcast packet reduction response (intel command 0x08) ... 831
10.6.2.9 get intel packet reduction filters formats ... 831
10.6.2.9.1 get intel packet reduction filters command (intel command 0x05) ... 831
10.6.2.9.2 set intel packet reduction filters response (intel command 0x05) ... 831
10.6.2.9.3 get unicast packet reduction command (intel command 0x05, reduction filter index 0x00) ... 832
10.6.2.9.4 get unicast packet reduction response (intel command 0x05, reduction filter index 0x00) ... 832
10.6.2.9.5 get multicast packet reduction command (intel command 0x05, reduction filter index 0x01) ... 832
10.6.2.9.6 get multicast packet reduction response (intel command 0x05, reduction filter index 0x01) ... 832
10.6.2.9.7 get broadcast packet reduction command (intel command 0x05, reduction filter index 0x02) ... 833
10.6.2.9.8 get broadcast packet reduction response (intel command 0x05, reduction filter index 0x02) ... 833
10.6.2.10 system mac address ... 833
10.6.2.10.1 get system mac address command (intel command 0x06) ... 833
10.6.2.10.2 get system mac address response (intel command 0x06) ... 834
10.6.2.11 set intel management control formats ... 834
10.6.2.11.1 set intel management control command (intel command 0x20) ... 834
10.6.2.11.2 set intel management control response (intel command 0x20) ... 835
10.6.2.12 get intel management control formats ... 835
10.6.2.12.1 get intel management control command (intel command 0x21) ... 835
10.6.2.12.2 get intel management control response (intel command 0x21) ... 835
10.6.2.13 tco reset ... 836
10.6.2.13.1 perform intel tco reset command (intel command 0x22) ... 836
10.6.2.13.2 perform intel tco reset response (intel command 0x22) ... 837
10.6.2.14 checksum offloading ... 837
10.6.2.14.1 enable checksum offloading command (intel command 0x23) ... 837
10.6.2.14.2 enable checksum offloading response (intel command 0x23) ... 837
10.6.2.14.3 disable checksum offloading command (intel command 0x24) ... 838
10.6.2.14.4 disable checksum offloading response (intel command 0x24) ... 838
10.6.2.15 macsec control commands format (intel command 0x30) ... 838
10.6.2.15.1 transfer macsec ownership to mc command (intel command 0x30, parameter 0x10) ... 838
10.6.2.15.2 transfer macsec ownership to mc response (intel command 0x30, parameter 0x10) ... 839
10.6.2.15.3 transfer macsec ownership to host command (intel command 0x30, parameter 0x11) ... 839
10.6.2.15.4 transfer macsec ownership to host response (intel command 0x30, parameter 0x11) ... 840
10.6.2.15.5 initialize macsec rx command (intel command 0x30, parameter 0x12) ... 840
10.6.2.15.6 initialize macsec rx response (intel command 0x30, parameter 0x12) ... 840
10.6.2.15.7 initialize macsec tx command (intel command 0x30, parameter 0x13) ... 841
10.6.2.15.8 initialize macsec tx response (intel command 0x30, parameter 0x13) ... 842
10.6.2.15.9 set macsec rx key command (intel command 0x30, parameter 0x14) ... 842
10.6.2.15.10 set macsec rx key response (intel command 0x30, parameter 0x14) ... 843
10.6.2.15.11 set macsec tx key command (intel command 0x30, parameter 0x15) ... 843
10.6.2.15.12 set macsec tx key response (intel command 0x30, parameter 0x15) ... 843
10.6.2.15.13 enable network tx encryption command (intel command 0x30, parameter 0x16) ... 844
10.6.2.15.14 enable network tx encryption response (intel command 0x30, parameter 0x16) ... 844
10.6.2.15.15 disable network tx encryption command (intel command 0x30, parameter 0x17) ... 845
10.6.2.15.16 disable network tx encryption response (intel command 0x30, parameter 0x17) ... 845
10.6.2.15.17 enable network rx decryption command (intel command 0x30, parameter 0x18) ... 845
10.6.2.15.18 enable network rx decryption response (intel command 0x30, parameter 0x18) ... 846
10.6.2.15.19 disable network rx decryption command (intel command 0x30, parameter 0x19) ... 846
10.6.2.15.20 disable network rx decryption response (intel command 0x30, parameter 0x19) ... 846
10.6.2.15.21 get macsec parameters format (intel command 0x31) ... 846
10.6.2.15.22 get macsec rx parameters command (intel command 0x31, parameter 0x01) ... 847
10.6.2.15.23 get macsec rx parameters response (intel command 0x31, parameter 0x01) ... 847
10.6.2.15.24 get macsec tx parameters command (intel command 0x31, parameter 0x02) ... 848
10.6.2.15.25 get macsec tx parameters response (intel command 0x31, parameter 0x02) ... 848
10.6.2.16 macsec aen (intel aen 0x80) ... 849
10.6.3 basic nc-si workflows ... 850
10.6.3.1 package states ... 850
10.6.3.2 channel states ... 850
10.6.3.3 discovery ... 850
10.6.3.4 configurations ... 851
10.6.3.4.1 nc capabilities advertisement ... 851
10.6.3.4.2 receive filtering ... 851
10.6.3.4.2.1 mac address filtering ... 851
10.6.3.4.3 vlan ... 852
10.6.3.5 pass-through traffic states ... 853
10.6.3.6 channel enable ... 853
10.6.3.7 network transmit enable ... 853
10.6.4 asynchronous event notifications ... 853
10.6.5 querying active parameters ... 854
10.6.6 resets ... 854
10.6.7 advanced workflows ... 854
10.6.7.1 multi-nc arbitration ... 854
10.6.7.2 package selection sequence example ... 855
10.6.7.3 external link control ... 856
10.6.7.4 set link while lan pcie functionality is disabled ... 856
10.6.7.5 multiple channels (fail-over) ... 856
10.6.7.5.1 fail-over algorithm example ... 857
10.6.7.6 statistics ... 857
10.7 manageability host interface ... 858
10.7.1 host csr interface (function 1/0) ... 858
10.7.2 host slave command interface to manageability ... 858
10.7.3 host slave command interface low level flow ... 858
10.7.4 host slave command registers ... 859
10.7.4.1 host interface control register (csr address 0x8f00; aux 0x0700) ... 859
10.7.4.2 firmware status 0 (fws0r) register (csr address 0x8f0c; aux 0x0702) ... 859
10.7.4.3 software status register (csr address 0x8f10; aux 0x0703) ... 859
10.7.5 host interface command structure ... 859
10.7.6 host interface status structure ... 860
10.7.7 checksum calculation algorithm ... 860
10.7.8 host slave interface commands ... 860
10.7.9 fail-over configuration host command ... 860
10.7.10 read fail-over configuration host command ... 861
10.8 macsec and manageability ... 862
10.8.1 handover of macsec responsibility between mc and host ... 863
10.8.1.1 kay ownership release by the host ... 863
10.8.1.2 kay ownership takeover by bmc ... 863
10.8.1.3 kay ownership request by the host ... 863
10.8.1.4 kay ownership release by bmc ... 864
10.8.1.5 control registers ... 865
10.8.2 filtering of non-macsec packets ... 866
10.8.3 sending of clear packets in a macsec environment ... 866
11.0 electrical / mechanical specification ... 867
11.1 introduction ... 867
11.2 operating conditions ... 868
11.2.1 recommended operating conditions ... 868
11.3 power delivery ... 868
11.3.1 power supply specification ... 868
intel ? 82576eb gbe controller ? contents intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 40 11.3.1.1 power on/off sequence ................................................................................................87 0 11.4 dc/ac specification ........................................................................................................ ......... 871 11.4.1 ball summary ............................................................................................................. ...... 871 11.4.2 dc specifications ........................................................................................................ ....... 871 11.4.2.1 current consumption.................................................................................................... 871 11.4.2.2 digital i/o............................................................................................................ .......874 11.4.2.3 open drain i/os ........................................................................................................ ..875 11.4.2.4 nc-si input and output pads ........................................................................................876 11.4.3 digital i/f ac specifications ............................................................................................ .... 876 11.4.3.1 digital i/o ac specifications .......................................................................................... 876 11.4.3.2 reset signals .......................................................................................................... .....878 11.4.3.2.1 internal_power_on_reset ..........................................................................................878 11.4.3.3 smbus .................................................................................................................. 
11.4.3.4 Flash AC Specification
11.4.3.5 EEPROM AC Specification
11.4.3.6 NC-SI AC Specification
11.4.3.7 JTAG AC Specification
11.4.3.8 MDIO AC Specification
11.4.3.9 SFP 2 Wires I/F AC Specification
11.4.3.10 PCIe/SerDes DC/AC Specification
11.4.3.11 PCIe Specification - Receiver
11.4.3.12 PCIe Specification - Transmitter
11.4.3.13 PCIe Specification - Input Clock
11.4.4 SerDes DC/AC Specification
11.4.4.1 SerDes Specification - Receiver
11.4.4.2 SerDes Specification - Transmitter
11.4.4.3 SerDes Specification - Input Clock
11.4.5 PHY Specification
11.4.6 XTAL/Clock Specification
11.4.6.1 Crystal Specification
11.4.6.2 External Clock Oscillator Specification
11.4.7 RBIAS Connection
11.5 EEPROM Flash Devices
11.5.1 Flash
11.5.2 EEPROM Device Options
11.6 Package Information
11.6.1 Mechanical
11.6.2 Intel® 82576 GbE Controller Package
11.6.2.1 Package Schematics
12.0 Design Guidelines
12.1 82575/82576
12.1.1 Pin Out Compatibility
12.1.1.1 Printed Circuit Board Requirements
12.1.1.2 82576 Design
12.1.1.3 82575 Design
12.2 Port Connection to the Device
12.2.1 PCIe Reference Clock
12.2.2 Other PCIe Signals
12.2.3 Physical Layer Features
12.2.3.1 Link Width Configuration
12.2.3.2 Polarity Inversion
12.2.3.3 Lane Reversal
12.2.4 PCIe Routing
12.3 Ethernet Component Design Guidelines
12.3.1 General Design Considerations for Ethernet Controllers
12.3.1.1 Clock Source
12.3.1.2 Magnetics for 1000 BASE-T
12.3.1.2.1 Magnetics Module Qualification Steps
12.3.1.2.2 Magnetics Module for 1000 BASE-T Ethernet
12.3.1.2.3 Third-Party Magnetics Manufacturers
12.3.1.2.4 Layout Guidelines for Use with Integrated and Discrete Magnetics
12.3.2 Designing with the 82576
12.3.2.1 LAN Disable
12.3.2.2 Serial EEPROM
12.3.2.2.1 EEPROM-less Operation
12.3.2.2.2 SPI EEPROMs
12.3.2.2.3 EEUPDATE
12.3.2.3 Flash
12.3.2.3.1 Flash Device Information
12.3.3 SMBus and NC-SI
12.3.4 NC-SI Electrical Interface Requirements
12.3.4.1 External Baseboard Management Controller (BMC)
12.3.4.2 Schematic Showing Pull-ups and Pull-downs for NC-SI Interface
12.3.4.3 Resets
12.3.4.4 Layout Requirements
12.3.4.4.1 Board Impedance
12.3.4.4.2 Trace Length Restrictions
12.3.5 Power Supplies for the Intel® 82576EB GbE Controller
12.3.5.1 Power Sequencing
12.3.5.1.1 Using Regulators with Enable Pins
12.3.5.2 Device Power Supply Filtering
12.3.5.3 Power Management and Wake Up
12.3.6 Device Test Capability
12.3.7 Software-Definable Pins (SDPs)
12.4 Frequency Control Device Design Considerations
12.4.1 Frequency Control Component Types
12.4.1.1 Quartz Crystal
12.4.1.2 Fixed Crystal Oscillator
12.4.1.3 Programmable Crystal Oscillators
12.4.1.4 Ceramic Resonator
12.5 Crystal Selection Parameters
12.5.1 Vibrational Mode
12.5.2 Nominal Frequency
12.5.3 Frequency Tolerance
12.5.4 Temperature Stability and Environmental Requirements
12.5.5 Calibration Mode
12.5.6 Load Capacitance
12.5.7 Shunt Capacitance
12.5.8 Equivalent Series Resistance
12.5.9 Drive Level
12.5.10 Aging
12.5.11 Reference Crystal
12.5.11.1 Reference Crystal Selection
12.5.11.2 Circuit Board
12.5.11.3 Temperature Changes
12.6 Oscillator Support
12.6.1 Oscillator Solution
12.7 Ethernet Component Layout Guidelines
12.7.1 Layout Considerations
12.7.1.1 Guidelines for Component Placement
12.7.1.2 Crystals and Oscillators
12.7.1.2.1 Crystal Layout Considerations
12.7.1.3 Board Stack Up Recommendations
12.7.1.4 Differential Pair Trace Routing for 10/100/1000 Designs
12.7.1.4.1 Signal Termination and Coupling
12.7.1.5 Signal Trace Geometry for 1000 BASE-T Designs
12.7.1.6 Trace Length and Symmetry for 1000 BASE-T Designs
12.7.1.6.1 Signal Detect
12.7.1.7 Routing 1.8 V to the Magnetics Center Tap
12.7.1.8 Impedance Discontinuities
12.7.1.9 Reducing Circuit Inductance
12.7.1.10 Signal Isolation
12.7.1.11 Power and Ground Planes
12.7.1.12 Traces for Decoupling Capacitors
12.7.1.13 Light Emitting Diodes for Designs Based on the 82576
12.7.1.14 Thermal Design Considerations
12.7.2 Physical Layer Conformance Testing
12.7.2.1 Conformance Tests for 10/100/1000 Mbps Designs
12.7.3 Troubleshooting Common Physical Layout Issues
12.8 SerDes Implementation
12.8.1 Connecting the SerDes Interface
12.8.2 Output Voltage Adjustment
12.8.3 Output Voltage Adjustment
12.9 Thermal Management
12.10 Reference Schematics
12.11 Checklists
12.12 Symbols
13.0 Thermal Design Specifications
13.1 Product Package Thermal Specification
13.2 Introduction
13.3 Measuring the Thermal Conditions
13.4 Thermal Considerations
13.5 Packaging Terminology
13.6 Thermal Specifications
13.6.1 Case Temperature
13.7 Thermal Attributes
13.7.1 Designing for Thermal Performance
13.7.2 Typical System Definitions
13.7.3 Package Thermal Characteristics
13.7.4 Clearance
13.7.5 Default Enhanced Thermal Solution
13.7.6 Extruded Heat Sinks
13.7.7 Attaching the Extruded Heat Sink
13.7.7.1 Clips
13.7.7.2 Thermal Interface Material (PCM45F)
13.7.8 Reliability
13.7.9 Thermal Interface Management for Heat-Sink Solutions
13.7.9.1 Bond Line Management
13.7.9.2 Interface Material Performance
13.7.9.2.1 Thermal Resistance of Material
13.7.9.2.2 Wetting/Filling Characteristics of Material
13.8 Measurements for Thermal Specifications
13.8.1 Case Temperature Measurements
13.8.1.1 Attaching the Thermocouple (No Heat Sink)
13.8.1.2 Attaching the Thermocouple (Heat Sink)
13.9 Heat Sink and Attach Suppliers
13.10 PCB Guidelines
14.0 Diagnostics
14.1 JTAG Test Mode Description
15.0 Models, Symbols, Testing Options, Schematics and Checklists
15.1 Models and Symbols
15.2 Physical Layer Conformance Testing
15.3 Schematics
15.4 Checklists
Appendix A. Changes from the 82575
1.0 Introduction

The Intel® 82576 GbE Controller is a single, compact, low-power component that offers two fully integrated Gigabit Ethernet Media Access Control (MAC) and physical layer (PHY) ports. The device uses a PCIe* v2.0 (2.5 GT/s) host interface. The 82576 enables a two-port implementation in a relatively small area and can be used in server system configurations such as rack-mounted or pedestal servers, where it can serve as an add-on NIC or a LAN on Motherboard (LOM) design. Another system configuration is blade servers, where it can be used as a LOM. The 82576 can also be used in embedded applications such as switch add-on cards and network appliances.

Figure 1-1. 82576 Block Diagram
1.1 Scope

This document presents the external architecture (including device operation, pin descriptions, register definitions, etc.) for the 82576, a dual 10/100/1000 LAN controller. It is intended as a reference for software device driver developers, board designers, test engineers, and others who may need specific technical or programming information.

1.2 Terminology and Acronyms

Table 1-1. Glossary

Definition: Meaning
1000BASE-BX: The PICMG 3.1 electrical specification for transmission of 1 Gb/s Ethernet or 1 Gb/s Fibre Channel encoded data over the backplane.
1000BASE-CX: 1000BASE-X over specialty shielded 150 Ω balanced copper jumper cable assemblies, as specified in IEEE 802.3 Clause 39.
1000BASE-T: The specification for 1 Gb/s Ethernet over Category 5e twisted-pair cables, as defined in IEEE 802.3 Clause 40.
AH: IP Authentication Header. An IPsec header providing authentication capabilities, defined in RFC 4302 (see footnote 1).
B/W: Bandwidth.
BIOS: Basic Input/Output System.
BMC: Baseboard Management Controller.
BT: Byte time.
BWG: Bandwidth group.
CA: Secure Connectivity Association. A security relationship, established and maintained by key agreement protocols, that comprises a fully connected subset of the service access points in stations attached to a single LAN that are to be supported by MACsec.
CPID: Congestion point identifier.
CTS: Cisco Trusted Security.
DCA: Intel® QuickData (Direct Cache Access).
DFP: Deficit fixed priority.
DFT: Design for testability.
DQ: Descriptor queue.
EEPROM: Electrically erasable programmable read-only memory. A non-volatile memory located on the LAN controller that is directly accessible from the host.
EOP: End of packet.
ESP: IP Encapsulating Security Payload. An IPsec header providing encryption and authentication capabilities, defined in RFC 4303 (see footnote 1).
FC: Flow control.
Firmware (FW): Embedded code on the LAN controller that is responsible for implementing the NC-SI protocol and pass-through functionality.
Host interface: RAM on the LAN controller that is shared between the firmware and the host. The RAM is used to pass commands from the host to the firmware and responses from the firmware to the host.
HPC: High-performance computing.
IPC: Inter-processor communication.
IPG: Inter-packet gap.
LAN (auxiliary power-up): The event of connecting the LAN controller to a power source (occurs even before system power-up).
LOM: LAN on Motherboard.
LSO: Large send offload.
MAC: Media Access Control.
MDIO: Management Data Input/Output interface over MDC/MDIO lines.
MIFS/MIPG: Minimum inter-frame spacing/minimum inter-packet gap.
MMW: Maximum memory window.
MSS: Maximum segment size.
NIC: Network interface controller.
PCS: Physical coding sublayer.
PF: Physical function (in a virtualization context).
PHY: Physical layer device.
PMA: Physical medium attachment.
PMD: Physical medium dependent.
PN (in a MACsec context): Packet number. A monotonically increasing value used to uniquely identify a MACsec frame in the sequence of frames transmitted using an SA.
NC-SI (type C): Reduced media independent interface (reduced MII).
SA: Source address.
SA (in a MACsec context): Secure Association. A security relationship that provides security guarantees for frames transmitted from one member of a CA to the others. Each SA is supported by a single secret key, or by a single set of keys where the cryptographic operations used to protect one frame require more than one key.
SC: Secure Channel. A security relationship used to provide security guarantees for frames transmitted from one member of a CA to the others. An SC is supported by a sequence of SAs, thus allowing the periodic use of fresh keys without terminating the relationship.
SCI: A globally unique identifier for a secure channel, comprising a globally unique MAC address and a port identifier that is unique within the system allocated that address.
SDP: Software-defined pins.
SerDes: Serializer and deserializer circuit.
SFD: Start frame delimiter.
SGMII: Serialized Gigabit Media Independent Interface.
SMBus: System Management Bus. A bus that connects various manageability components, including the LAN controller, BIOS, sensors, and remote-control devices.
TRL: Transmit rate limiting or transmit rate limiter, depending on the context.
TSO: Transmit segmentation offload. A mode in which a large TCP/UDP I/O is handed to the device, and the device segments it into L2 packets according to the requested MSS.
VF: Virtual function.
VM: Virtual machine.
VPD: Vital product data (PCI protocol).

Footnote 1: The IPsec function is present in the 82576EB SKU. IPsec is removed from the 82576NS SKU.

1.2.1 External Specification and Documents

The 82576 implements features from the following specifications.

1.2.1.1 Network Interface Documents

1. IEEE Standard 802.3, 2005 Edition (Ethernet). Incorporates various IEEE standards previously published separately. Institute of Electrical and Electronics Engineers (IEEE).
2. IEEE Standard 1149.1, 2001 Edition (JTAG). Institute of Electrical and Electronics Engineers (IEEE).
3. IEEE Standard 802.1Q for VLAN.
4. PICMG3.1 Ethernet/Fibre Channel Over PICMG 3.0 Draft Specification, January 14, 2003, Version D1.0.
5. Serial-GMII Specification, Cisco Systems document ENG-46158, revision 1.7.
6. INF-8074i Specification for SFP (Small Formfactor Pluggable) Transceiver (ftp://ftp.seagate.com/sff).
1.2.1.2 Host Interface Documents

1. PCI-Express 2.0 Base Specification, revision 1.0
2. PCI Specification, version 3.0
3. PCI Bus Power Management Interface Specification, rev. 1.2, March 2004
4. Advanced Configuration and Power Interface Specification, rev 2.0b, October 2002

1.2.1.3 Virtualization Documents

1. PCI-Express Single Root I/O Virtualization and Sharing Specification, rev 0.9
2. PCI SIG Alternative Routing-ID Interpretation (ARI) ECN (http://teamsites.ch.ith.intel.com/sites/pasdpa/pcie/pci%20express%20product_spec%20coordination/pages/pcisig%20wip%20docs.aspx)

1.2.1.4 Networking Protocol Documents

1. IPv4 specification (RFC 791)
2. IPv6 specification (RFC 2460)
3. TCP/UDP specification (RFC 793/768)
4. SCTP specification (RFC 2960)
5. ARP specification (RFC 826)
6. EUI-64 specification, http://standards.ieee.org/regauth/oui/tutorials/eui64.html

1.2.1.5 Manageability Documents

1. DMTF Network Controller Sideband Interface (NC-SI) Specification, rev 0.7. This product is type C.
2. System Management Bus (SMBus) Specification, SBS Implementers Forum, ver. 2.0, August 2000

1.2.1.6 Security Documents

1. IEEE P802.1AE/D5.1 - Draft Standard for Local and Metropolitan Area Networks - Media Access Control (MAC) Security
2. The Use of Galois/Counter Mode (GCM) in IPsec Encapsulating Security Payload (ESP) (RFC 4106)
3. IP Authentication Header (AH) (RFC 4302)
4. IP Encapsulating Security Payload (ESP) (RFC 4303)
5. The Use of Galois Message Authentication Code (GMAC) in IPsec ESP and AH (RFC 4543)

1.2.2 Intel Application Notes

1. Intel® Ethernet Controllers Loopback Modes - Application Note

1.2.3 Reference Schematics

Reference schematics (SerDes\fiber\SFP and copper) are available as a separate document through Intel documentation channels.
1.2.4 Checklists

The schematic checklist and the layout and placement checklist are available as a separate document through Intel documentation channels.

1.3 Product Overview

The 82576 supports two ports, each using either an internal PHY or a SerDes or SGMII port that may be connected to an external PHY or directly to a blade connection for MAC-to-MAC communication.

1.3.1 System Configurations

The 82576 targets server system configurations such as rack-mounted or pedestal servers, where it can be used as an add-on NIC or a LAN on Motherboard (LOM) design. Another system configuration is blade servers, where it can be used as a LOM. The 82576 can also be used in embedded applications such as switch add-on cards and network appliances.

1.4 External Interface

1.4.1 PCIe* Interface

The 82576 uses the PCIe v2.0 (2.5 GT/s) interface as its host interface. It supports x4, x2, and x1 configurations, with each lane running at 2.5 GT/s. The maximum aggregated raw bandwidth for a typical x4 configuration is 8 Gb/s in each direction. See Chapter 2.0 for a full description. The timing characteristics of this interface are defined in the PCI Express Card Electromechanical Specification rev 1.0 and in the PCIe v2.0 (2.5 GT/s) specification.

1.4.2 Network Interfaces

Two independent interfaces are used to connect the two 82576 ports to external devices. The following protocols are supported:
- 10BASE-T and 100BASE-T.
- 1000BASE-T interface to attach directly to a CAT 5e wire.
- SerDes interface to connect over a backplane to another SerDes-compliant device or to an optical module.
- SGMII interface to attach to an external PHY, either on board or via an SFP module. The SGMII shares the same interface as the SerDes.
- MDI (copper) support for the standard IEEE 802.3 Ethernet interface for 1000BASE-T, 100BASE-TX, and 10BASE-T applications (802.3, 802.3u, and 802.3ab).
See Section 2.1.8.2 and Section 2.1.6 for full pin descriptions, and Section 11.4.4.1 through Section 11.4.4.3 for the timing characteristics of this interface.
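The 8 Gb/s per-direction figure quoted for the x4 PCIe link in Section 1.4.1 can be reproduced from the 2.5 GT/s lane rate. The sketch below assumes the figure reflects the 8b/10b line coding used at this signaling rate (8 data bits carried per 10 bits on the wire); the function name is hypothetical and the datasheet does not spell out this derivation.

```python
def pcie_2p5gt_bandwidth_gbps(lanes: int, gt_per_s: float = 2.5) -> float:
    """Per-direction bandwidth in Gb/s for a 2.5 GT/s PCIe link.

    Assumption: 8b/10b coding, so only 8 of every 10 transferred
    bits carry data.
    """
    if lanes not in (1, 2, 4):
        raise ValueError("the 82576 supports x1, x2 and x4 configurations")
    return lanes * gt_per_s * 8 / 10


if __name__ == "__main__":
    for width in (1, 2, 4):
        print(f"x{width}: {pcie_2p5gt_bandwidth_gbps(width)} Gb/s per direction")
```

Under this assumption the x1, x2 and x4 configurations give 2, 4 and 8 Gb/s per direction respectively, before any packet or protocol overhead.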
1.4.3 EEPROM Interface

The 82576 uses an EEPROM device for storing product configuration information. Several words of the EEPROM are accessed automatically by the 82576 after reset in order to provide pre-boot configuration data that must be available to the 82576 before it is accessed by host software. The remainder of the stored information is accessed by various software modules used to report product configuration, serial number, etc.

The 82576 is intended for use with an SPI (4-wire) serial EEPROM device such as an AT25040AN or compatible. See Section 2.1.2 for a full pin description and Section 11.4.3.5 for the timing characteristics of this interface. The 82576 also supports an EEPROM-less mode, where all of the setup is done by software.

1.4.4 Serial Flash Interface

The 82576 provides an external SPI serial interface to a flash or boot ROM device such as the Atmel* AT25F1024 or AT25FB512. The 82576 supports serial flash devices with up to 64 Mb (8 MB) of memory. The size of the flash used by the 82576 can be configured by the EEPROM. See Section 2.1.2 for a full pin description and Section 11.4.3.4 for the timing characteristics of this interface.

Note: Though the 82576 supports devices with up to 8 MB of memory, bigger devices can also be used. Accesses to memory beyond the flash device size result in access wrapping, as only the lower address bits are used by the flash device.

1.4.5 SMBus Interface

SMBus is an optional interface for pass-through and/or configuration traffic between an MC and the 82576. The 82576's SMBus interface can be configured to support both slow and fast timing modes. See Section 2.1.3 for a full pin description and Section 11.4.3.3 for the timing characteristics of this interface.

1.4.6 NC-SI Interface

NC-SI and SMBus interfaces are optional for pass-through and/or configuration traffic between an MC and the 82576.
The NC-SI interface meets the DMTF NC-SI Specification, rev. 1.0.0.a.

1.4.7 MDIO/2-Wire Interfaces

The 82576 implements two management interfaces for control of an optional external PHY. Each interface can be either a 2-wire interface used to control an SFP module or an MDIO/MDC management interface for the control plane connection between the MAC and PHY devices (master side). This interface provides the MAC and software with the ability to monitor and control the state of the PHY. The 82576 supports the data formats of 802.3 clause 22. Each MDIO interface should be connected to the relevant PHY. See Section 2.1.7 for a full pin description and Section 11.4.3.9 for the timing characteristics of this interface.
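As an illustration of the clause 22 data format mentioned above, the sketch below packs the 32 bits that follow the preamble of an MDIO management frame: start (01), opcode (10 = read, 01 = write), 5-bit PHY address, 5-bit register address, turnaround, and 16 data bits. The function name and the way the frame is returned as an integer are assumptions for illustration only; this is not the 82576 register interface used to issue MDIO cycles.

```python
def clause22_frame(op: str, phy_addr: int, reg_addr: int, data: int = 0) -> int:
    """Pack an IEEE 802.3 clause 22 management frame (preamble excluded).

    For a read, the turnaround and data bits are driven by the PHY,
    so this sketch leaves them as zero in the returned value.
    """
    if not (0 <= phy_addr < 32 and 0 <= reg_addr < 32):
        raise ValueError("PHY and register addresses are 5-bit fields")
    if not 0 <= data <= 0xFFFF:
        raise ValueError("data is a 16-bit field")
    opcodes = {"read": 0b10, "write": 0b01}
    if op not in opcodes:
        raise ValueError("op must be 'read' or 'write'")

    frame = (0b01 << 30) | (opcodes[op] << 28) | (phy_addr << 23) | (reg_addr << 18)
    if op == "write":
        # On a write the station drives turnaround = 10, then the data.
        frame |= (0b10 << 16) | data
    return frame


if __name__ == "__main__":
    # Write 0x1140 to register 0 of the PHY at address 1.
    print(hex(clause22_frame("write", phy_addr=1, reg_addr=0, data=0x1140)))
```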
1.4.8 Software-Definable Pins (SDP) Interface (General-Purpose I/O)

The 82576 has four software-defined pins (SDP pins) per port that can be used for miscellaneous hardware or software-control purposes. These pins can be individually configured to act as either input or output pins. The default direction of each pin is configurable via the EEPROM (see Section 6.2.8 and Section 6.2.9), as is the default value of all pins configured as outputs. To avoid signal contention, all pins are set as input pins until the EEPROM configuration is loaded.

All four of the SDP pins can be configured for use as General-Purpose Interrupt (GPI) inputs. To act as GPI pins, the desired pins must be configured as inputs. A corresponding GPI interrupt-detection enable bit is then used to enable rising-edge detection of the input pin (rising-edge detection occurs by comparing values sampled at the internal clock rate, as opposed to an edge-detection circuit). When detected, a corresponding GPI interrupt is indicated in the Interrupt Cause register.

The use, direction, and values of SDP pins are controlled and accessed using fields in the Device Control (CTRL) register and Extended Device Control (CTRL_EXT) register. See Section 2.1.5 for a full pin description of this interface.

1.4.9 LEDs Interface

The 82576 implements four output drivers per port intended for driving external LED circuits. Each of the four LED outputs can be individually configured to select the particular event, state, or activity that is indicated on that output. In addition, each LED can be individually configured for output polarity as well as for blinking versus non-blinking (steady-state) indication. The configuration for LED outputs is specified via the LEDCTL register.
Furthermore, the hardware-default configuration for all LED outputs can be specified via EEPROM fields (see Section 6.2.19 and Section 6.2.21), thereby supporting LED displays configurable to a particular OEM preference. See Section 2.1.8.1 for a full pin description of this interface. See Section 7.5 for a more detailed description of LED behavior.

1.5 Comparing Product Features

The following tables compare features of similar Intel components.

Table 1-2. 82576 Features
Feature | 82576 | 82575 | 82571EB
Number of ports | 2 | 2 | 2
Serial flash interface | Y | Y | Y
4-wire SPI EEPROM interface | Y | Y | Y
Configurable LED operation for software or OEM custom-tailoring of LED displays | Y | Y | Y
Protected EEPROM space for private configuration | Y | Y | Y
Device disable capability | Y | Y | Y
Package size (mm x mm) | 25x25 | 25x25 | 17x17
Watchdog timer | Y | Y | N

Table 1-3. 82576 Network Features
Feature | 82576 | 82575 | 82571EB
Half duplex at 10/100 Mb/s operation and full duplex operation at all supported speeds | Y | Y | Y
10/100/1000 copper PHY integrated on-chip | Y | Y | Y
Jumbo frames supported | Y | Y | Y
Max size of jumbo frames supported | 9500 bytes | 9500 bytes | 9000 bytes
Flow control support: send/receive pause frames and receive FIFO thresholds | Y | Y | Y
Statistics for management and RMON | Y | Y | Y
802.1q VLAN support | Y | Y | Y
SerDes interface for external PHY connection or system interconnect | Y | Y | Y
SGMII interface for embedded applications | Y | Y | N
Fiber/copper auto-sense* | Y | Y | N
SerDes support of non-auto-negotiation partner | Y | Y | N
SerDes signal detect | Y | Y | N

Table 1-4. 82576 Host Interface Features
Feature | 82576 | 82575 | 82571EB
PCIe revision | 2.0 | 2.0 | 1.0a
PCIe physical layer | 2.5 GT/s | 2.5 GT/s | 2.5 GT/s
Bus width | x1, x2, x4 | x1, x2, x4 | x1, x2, x4
64-bit address support for systems using more than 4 GB of physical memory | Y | Y | Y
Outstanding requests for Tx buffers | 4 | 4 | 4
Outstanding requests for Tx descriptors | 1 | 1 | 1
Outstanding requests for Rx descriptors | 1 | 1 | 1
Credits for posted writes | 2 | 2 | 2
Max payload size supported | 512 B | 256 B | 256 B
Max request size supported | 512 B | 512 B | 256 B
Link layer retry buffer size | 2 KB | 2 KB | 2 KB
Vital Product Data (VPD) | Y | N | N

Table 1-5. 82576 LAN Functions Features
Feature | 82576 | 82575 | 82571EB
Programmable host memory receive buffers | Y | Y | Y
Descriptor ring management hardware for transmit and receive | Y | Y | Y
ACPI register set and power down functionality supporting D0 & D3 states | Y | Y | Y
Software controlled global reset bit (resets everything except the configuration registers) | Y | Y | Y
Software Definable Pins (SDP) - per port | 4 | 4 | 4
Four SDP pins can be configured as general purpose interrupts | Y | Y | Only 2
Wake up | Y | Y | Y
IPv6 wake-up filters | Y | Y | Y
Configurable (through the EEPROM) flexible filter | Y | Y | Y
Default configuration by the EEPROM for all LEDs for pre-driver functionality | Y | Y | Y
LAN function disable capability | Y | Y | Y
Programmable memory transmit buffers (up to 32 KB) | Y | Y | Y
Double VLAN | Y | Y | N
IEEE 1588 | Y | N | N

Table 1-6. 82576 LAN Performance Features
Feature | 82576 | 82575 | 82571EB
TCP segmentation offload up to 256 KB | Y | Y | Y
Transmit Rate Limiting (TRL) | Y | N | N
IPv6 support for IP/TCP and IP/UDP receive checksum offload | Y | Y | Y
Fragmented UDP checksum offload for packet reassembly | Y | Y | Y
Message Signaled Interrupts (MSI) | Y | Y | Y
Message Signaled Interrupts (MSI-X) | Y | Y | N
Packet interrupt coalescing timers (packet timers) and absolute-delay interrupt timers for both transmit and receive operation | Y | N | N
Interrupt throttling control to limit maximum interrupt rate and improve CPU utilization | Y | Y | Y
Rx packet split header | Y | Y | Y
Receive Side Scaling (RSS) number of queues | Up to 16 | 4 | 2
Total number of Rx queues per port | 16 | 4 | 2
Total number of Tx queues per port | 16 | 4 | 2
Rx header replication | Y | Y | Y
Low latency interrupt | Y | Y | N
DCA support | Y | Y | N
TCP timer interrupts | Y | Y | N
No snoop | Y | Y | Y
Relax ordering | Y | Y | Y
TSO interleaving for reduced latency | Y | N | N
Receive side coalescing | N | N | N
SCTP receive and transmit checksum offload | Y | N | N
UDP TSO | Y | N | N

Table 1-7. 82576 Virtualization Features
Feature | 82576 | 82575 | 82571EB
Support for Virtual Machines Device Queues (VMDq) | 8 pools | 4 | N
PCI-SIG SR-IOV | 8 VF | N | N
Multicast/broadcast packet replication | Y | N | N
VM to VM packet forwarding | Y | N | N
Traffic shaping | Y | N | N
MAC addresses | 24 | 16 | 15
MAC and VLAN anti-spoofing | Y | N | N
VLAN filtering | Per pool | Global | Global
Per-pool statistics | Y | N | N
Per-pool off loads | Y | Partial | N
Per-pool jumbo support | Y | N | N
Mirroring rules | 4 | 0 | 0

Table 1-8. 82576 Manageability Features
Feature | 82576 | 82575 | 82571EB
Advanced pass-through-compatible management packet transmit/receive support | Y | Y | Y
Manageability support for ASF 1.0 and Alert on LAN 2.0 | N | Y | Y
SMBus interface to external BMC | Y | Y | Y
DMTF NC-SI protocol standards support | Y | Y | N
L2 address filters | 4 | 4 | 1
VLAN L2 filters | 8 | 8 | 4
Flex L3 port filters | 16 | 16 | 3
Flex TCO filters | 4 | 4 | 2
L3 address filters (IPv4) | 4 | 4 | 4
L3 address filters (IPv6) | 4 | 4 | 1

Table 1-9. 82576 Security Features
Feature | 82576 | 82575 | 82571EB
Integrated MACsec security engines: GCM AES 128 encryption or authentication engine; one secure connection; two security associations; replay protection with zero window | Y | N | N
Integrated IPsec offload engine (note 1) | Y | N | N
  • Security associations - Rx | 256 | |
  • Security associations - Tx | 256 | |
  • IP Authentication Header (AH) protocol | Y | |
  • IP Encapsulating Security Payload (ESP) for authentication and/or encryption | Y | |
  • AES-128-GMAC (128-bit key) engine | Y | |
  • IPv4 and IPv6 support (without options or extensions) | Y | |
1. IPsec functionality is present in the 82576EB SKU. IPsec is removed from the 82576NS SKU.

1.6 Overview of New Capabilities

The following section describes features added in the Intel 82576 GbE Controller that are new relative to the 82575.

1.6.1 IPsec Off Load for Flows

Note: The IPsec function is present in the 82576EB SKU. IPsec is removed from the 82576NS SKU.

The 82576 (SKU: 82576EB) supports IPsec off load for a given number of flows. It is the operating system's responsibility to submit the most heavily loaded flows to hardware in order to get the maximum benefit from the IPsec off load in terms of CPU utilization savings. Main features are:
• Off-load IPsec for up to 256 Security Associations (SA) for each of Tx and Rx.
• AH and ESP protocols for authentication and encryption
• AES-128-GMAC and AES-128-GCM crypto engines
• Transport mode encapsulation
• IPv4 and IPv6 versions (no options or extension headers)

1.6.2 Security

The 82576 supports the IEEE 802.1AE specification. It incorporates an inline packet crypto unit to support both privacy and integrity checks on a packet-by-packet basis. The transmit data path includes both encryption and signing engines. On the receive data path, the 82576 includes a decryption engine and an integrity checker. The crypto engines use an AES GCM algorithm that is designed to support the 802.1AE protocol. Note that both host traffic and MC management traffic might be subjected to authentication and/or encryption.

1.6.3 Transmit Rate Limiting (TRL)

The 82576 supports limiting the transmit rate. TRL can be enabled for each transmit queue. The following modes of TRL are used:

• Frame overhead - IPG is extended by a fixed value for all transmit queues.
• Payload rate - IPG, stretched relative to frame size, provides pre-determined data (byte) rates for each transmit queue.

1.6.4 Performance

The 82576 improvements include:

• Latency - The 82576 reduces end-to-end latency for high-priority traffic in the presence of other traffic. Specifically, the 82576 reduces the delay caused by preceding TSO packets.
• CPU utilization - The 82576 supports reducing CPU utilization in a virtualized system by incorporating enhancements to the VMDq feature.

1.6.4.1 Tx Descriptor Write-Back

This functionality is an improvement to the way Tx descriptors are written back to memory. Instead of writing back the DD bit into the descriptor location, the head pointer is updated in system memory. The head pointer is updated based on the RS bit or prior to expiration of the corresponding interrupt vector.

1.6.5 Rx and Tx Queues

The number of Tx and Rx queues in the 82576 was increased to 16 queues.

1.6.6 Interrupts

The following changes in the interrupt scheme are implemented in the 82576:

• Rate controlling of low latency interrupts
• Extensions to the low latency interrupt filters to enable immediate interrupt by full 5-tuple matching

1.6.7 Virtualization

1.6.7.1 PCI SR-IOV

The 82576 supports the PCI-SIG Single-Root I/O Virtualization and Sharing specification (SR-IOV), including the following functionality:

• Support for up to 8 virtual functions
• Partial replication of PCI configuration space
• Allocation of MMIO space per virtual function
• Allocation of a requester ID per virtual function
• Virtualization of interrupts

1.6.7.2 Packet Classification

Received unicast packets are forwarded to the appropriate VM queue based on their unicast L2 address. Broadcast and multicast (MC) packets, however, might need to be forwarded to multiple VMs. Multicast is commonly used to share information among a group of systems. Received MC packets are forwarded to their destination VMs based on the mapping between the MC address and the target VMs. Broadcast packets that are VLAN tagged are forwarded to destination VMs based on their VLAN tag. Note that a VM might be associated with multiple VLAN addresses. A broadcast packet that is not VLAN tagged can optionally be forwarded to all VMs.

Packet forwarding services inter-VM communication by forwarding transmit packets from a transmit queue to an Rx software queue. The motivation for executing packet forwarding in the 82576 is the direct assignment architecture, where it is desired that a guest VM interacts directly with the 82576 using a standard device driver. If packet forwarding were done by system software, the guest VM (its device driver) would need to filter local packets and pass them to a software switch for forwarding. Transmit packets with a local destination are classified based on the same criteria as packets received from the wire.
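The classification rules in Section 1.6.7.2 can be sketched as a small lookup function. The table names, the pool numbering, and the `untagged_broadcast_to_all` flag below are hypothetical and only model the behavior described in the text; they do not correspond to actual 82576 registers or filter structures.

```python
from typing import Dict, Optional, Set

BROADCAST = bytes([0xFF] * 6)
ALL_POOLS = frozenset(range(8))  # the 82576 supports 8 VM pools


def classify(dst_mac: bytes,
             vlan: Optional[int],
             unicast_map: Dict[bytes, int],
             multicast_map: Dict[bytes, Set[int]],
             vlan_map: Dict[int, Set[int]],
             untagged_broadcast_to_all: bool = True) -> Set[int]:
    """Return the set of VM pools that should receive a packet."""
    if dst_mac == BROADCAST:
        if vlan is not None:
            # Tagged broadcast: forward by VLAN membership.
            return set(vlan_map.get(vlan, set()))
        # Untagged broadcast: optionally forwarded to all VMs.
        return set(ALL_POOLS) if untagged_broadcast_to_all else set()
    if dst_mac[0] & 0x01:
        # Multicast: forward by the MC address to target VM mapping.
        return set(multicast_map.get(dst_mac, set()))
    # Unicast: forward by L2 address to a single VM queue.
    pool = unicast_map.get(dst_mac)
    return {pool} if pool is not None else set()


if __name__ == "__main__":
    uc = {bytes.fromhex("020000000001"): 0, bytes.fromhex("020000000002"): 1}
    mc = {bytes.fromhex("01005e000001"): {0, 1, 3}}
    vl = {100: {1, 2}}
    print(classify(bytes.fromhex("020000000002"), None, uc, mc, vl))
    print(classify(BROADCAST, 100, uc, mc, vl))
```

Because transmit packets with a local destination are classified by the same criteria, the same function could model VM-to-VM forwarding by passing it the destination of an outgoing packet.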
1.6.7.3 Hardware Virtualization

This section covers replication of hardware resources beyond the scope of PCI resources handled by PCI SR-IOV. The following features are supported:

• Interrupts - Part of the interrupts are assigned per VM.
• Statistics - Enable read access to VMs in the direct assignment model without the clear-on-read side effect.
• Storm control - If an unusually high bandwidth of broadcast or multicast packets is detected, the 82576 can be configured to drop broadcast or multicast packets until the storm condition is over.
• Security features - VLAN and MAC anti-spoof are supported, as well as insertion of VLAN according to the physical function control.
1.6.7.4 Bandwidth Allocation

The 82576 allows allocation of transmit bandwidth among the virtual interfaces to avoid unfair use of bandwidth by a single VM.

1.6.8 VPD

The 82576 supports the Vital Product Data (VPD) capability defined in the PCI Specification, version 3.0.

1.6.9 64-Bit BARs Support

The 82576 supports different configurations of the I/O and MMIO base address registers to allow support of 64-bit mappings of BARs.

1.6.10 IEEE 1588 - Precision Time Protocol (PTP)

The IEEE 1588 international standard enables networked Ethernet equipment to synchronize internal clocks according to a network master clock. The protocol is implemented in software, with the 82576 providing accurate time measurements of special Tx and Rx packets close to the Ethernet link. These packets measure the latency between the master clock and an end-point clock in both link directions. The endpoint can then acquire an accurate estimate of the master time by compensating for link latency.

The 82576 provides the following support for the IEEE 1588 protocol:

• Detection of specific PTP Rx packets and capturing the time of arrival of such packets in dedicated CSRs
• Detection of specific PTP Tx packets and capturing the time of transmission of such packets in dedicated CSRs
• A software-visible reference clock for the previously mentioned time captures
• Support for both the L2-based and the UDP-based versions of the protocol
• Generation of an external clock on one of the SDPs
• Triggering of external devices based on the internal clock
• Timestamps of external events

1.7 Device Data Flows

1.7.1 Transmit Data Flow

Tx data flow provides a high-level description of all data/control transformation steps needed for sending Ethernet packets to the line.
Table 1-10. Transmit Data Flow
Step | Description
1 | The host creates a descriptor ring and configures one of the 82576's transmit queues with the address location, length, head and tail pointers of the ring (one of 16 available Tx queues).
2 | The host is requested by the TCP/IP stack to transmit a packet; it gets the packet data within one or more data buffers.
3 | The host initializes descriptor(s) that point to the data buffer(s) and have additional control parameters that describe the needed hardware functionality. The host places that descriptor in the correct location at the appropriate Tx ring.
4 | The host updates the appropriate queue tail pointer (TDT).
5 | The 82576's DMA senses a change of a specific TDT and as a result sends a PCIe request to fetch the descriptor(s) from host memory.
6 | The descriptor(s) content is received in a PCIe read completion and is written to the appropriate location in the descriptor queue internal cache.
7 | The DMA fetches the next descriptor from the internal cache and processes its content. As a result, the DMA sends PCIe requests to fetch the packet data from system memory.
8 | The packet data is received from PCIe completions and passes through the transmit DMA, which performs all programmed data manipulations (various CPU off-loading tasks such as checksum off load, TSO off load, etc.) on the packet data on the fly.
9 | While the packet is passing through the DMA, it is stored into the transmit FIFO. After the entire packet is stored in the transmit FIFO, it is forwarded to the transmit switch module.
10 | If the packet destination is also local, it is also sent to the local switch memory and joins the receive path.
11 | The transmit switch arbitrates between host and management packets and eventually forwards the packet to the security engine.
12 | The security engine optionally applies L3 (IPsec) or L2 (MACsec) encryption or authentication and forwards the packet to the MAC.
13 | The MAC appends the L2 CRC to the packet and sends the packet to the line using a pre-configured interface.
14 | When all the PCIe completions for a given packet are done, the DMA updates the appropriate descriptor(s).
15 | After enough descriptors are gathered for write back or the interrupt moderation timer expires, the descriptors are written back to host memory using PCIe posted writes. Alternatively, only the head pointer can be written back.
16 | After the interrupt moderation timer expires, an interrupt is generated to notify the host device driver that the specific packet has been read to the 82576 and the driver can release the buffers.

1.7.2 Receive Data Flow

Receive (Rx) data flow provides a high-level description of all data/control transformation steps needed for receiving Ethernet packets.

Table 1-11. Receive Data Flow
Step | Description
1 | The host creates a descriptor ring and configures one of the 82576's receive queues with the address location, length, head, and tail pointers of the ring (one of 16 available Rx queues).
2 | The host initializes descriptors that point to empty data buffers. The host places these descriptors in the correct location at the appropriate Rx ring.
3 | The host updates the appropriate queue tail pointer (RDT).
4 | The 82576's DMA senses a change of a specific RDT and as a result sends a PCIe request to fetch the descriptors from host memory.
5 | The descriptors' content is received in a PCIe read completion and is written to the appropriate location in the descriptor queue internal cache.
6 | A packet enters the Rx MAC. The Rx MAC checks the CRC of the packet.
7 | The MAC forwards the packet to an Rx filter.
8 | If the packet is a MACsec or an IPsec packet and the adequate key is stored in the hardware, the packet is decrypted and authenticated.
9 | If the packet matches the pre-programmed criteria of the Rx filtering, it is forwarded to the Rx FIFO. VLAN and CRC are optionally stripped from the packet, the L3/L4 checksums are checked, and the destination queue is fixed.
10 | The receive DMA fetches the next descriptor from the internal cache of the appropriate queue to be used for the next received packet.
11 | After the entire packet is placed into the Rx FIFO, the receive DMA posts the packet data to the location indicated by the descriptor through the PCIe interface. If the packet size is greater than the buffer size, more descriptors are fetched and their buffers are used for the received packet.
12 | When the packet is placed into host memory, the receive DMA updates all the descriptor(s) that were used by packet data.
13 | After enough descriptors are gathered for write back, or the interrupt moderation timer expires, or the packet requires immediate forwarding, the receive DMA writes back the descriptor content along with status bits that indicate the packet information, including what off loads were done on that packet.
14 | After the interrupt moderation timer completes or an immediate packet is received, the 82576 initiates an interrupt to the host to indicate that a new received packet is already in host memory.
15 | The host reads the packet data and sends it to the TCP/IP stack for further processing. The host releases the associated buffers and descriptors once they are no longer in use.
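Both data flows rely on a descriptor ring shared between host and device: the host makes descriptors available by advancing the tail pointer (TDT or RDT), and the device advances the head as it consumes them. The following is a minimal software model of that hand-off, assuming the common convention that head equal to tail means the device owns no descriptors; the class and method names are hypothetical, and real rings live in host memory and are driven through device registers.

```python
class DescriptorRing:
    """Toy model of a head/tail descriptor ring (not the 82576 registers)."""

    def __init__(self, size: int) -> None:
        if size < 2:
            raise ValueError("ring needs at least two entries")
        self.size = size
        self.head = 0  # next descriptor the device will process
        self.tail = 0  # one past the last descriptor the host posted

    def pending(self) -> int:
        """Descriptors currently owned by the device."""
        return (self.tail - self.head) % self.size

    def free_slots(self) -> int:
        """Slots the host may still post; one slot stays unused so that
        head == tail always means 'empty'."""
        return self.size - 1 - self.pending()

    def host_post(self, count: int) -> None:
        """Host fills descriptors, then bumps the tail pointer."""
        if count > self.free_slots():
            raise RuntimeError("ring full")
        self.tail = (self.tail + count) % self.size

    def device_consume(self, count: int) -> int:
        """Device processes up to `count` descriptors and advances head."""
        done = min(count, self.pending())
        self.head = (self.head + done) % self.size
        return done


if __name__ == "__main__":
    ring = DescriptorRing(8)
    ring.host_post(5)
    ring.device_consume(3)
    print(ring.head, ring.tail, ring.pending(), ring.free_slots())
```

In this model the driver releases buffers for every descriptor the head has passed, which mirrors the final step of each table.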
2.0 Pin Interface

2.1 Pin Assignment

The 82576 is packaged in a 25 mm x 25 mm FCBGA package with a 1 mm ball pitch.

Table 2-1. Signal Type Definition
Type | Description | DC Specification
In | Input is a standard input-only signal. | See Section 11.4.2.2
Out | Totem pole output is a standard active driver. | See Section 11.4.2.2
T/S | Tri-state is a bi-directional, tri-state input/output pin. | See Section 11.4.2.2
O/D | Open drain allows multiple devices to share as a wire-OR. | See Section 11.4.2.3
NC-SI-in | Input signal | See Section 11.4.2.4
NC-SI-out | Output signal | See Section 11.4.2.4
A | Analog PHY signals | See Section 11.4.5
A-in | Analog input signals | See Section 11.4.4
A-out | Analog output signals | See Section 11.4.4
B | Input bias | See Section 11.4.7

2.1.1 PCIe

The AC specification for these pins is described in Chapter 11.0.

Table 2-2. PCI* Pins
Symbol | Ball # | Type | Name and Function
PE_CLK_P, PE_CLK_N | N2, N1 | A-in | PCIe* differential reference clock in: A 100 MHz differential clock input. This clock is used as the reference clock for the PCIe* Tx/Rx circuitry and by the PCIe* core PLL to generate clocks for the PCIe* core logic.
PET_0_P, PET_0_N | D2, D1 | A-out | PCIe* serial data output: A serial differential output pair running at 2.5 Gb/s. This output carries both data and an embedded 2.5 GHz clock that is recovered along with data at the receiving end.
PET_1_P, PET_1_N | H2, H1 | A-out | PCIe* serial data output: A serial differential output pair running at 2.5 Gb/s. This output carries both data and an embedded 2.5 GHz clock that is recovered along with data at the receiving end.
PET_2_P, PET_2_N | R2, R1 | A-out | PCIe* serial data output: A serial differential output pair running at 2.5 Gb/s. This output carries both data and an embedded 2.5 GHz clock that is recovered along with data at the receiving end.
PET_3_P, PET_3_N | W2, W1 | A-out | PCIe* serial data output: A serial differential output pair running at 2.5 Gb/s. This output carries both data and an embedded 2.5 GHz clock that is recovered along with data at the receiving end.
PER_0_P, PER_0_N | F2, F1 | A-in | PCIe* serial data input: A serial differential input pair running at 2.5 Gb/s. An embedded clock present in this input is recovered along with the data.
PER_1_P, PER_1_N | K2, K1 | A-in | PCIe* serial data input: A serial differential input pair running at 2.5 Gb/s. An embedded clock present in this input is recovered along with the data.
PER_2_P, PER_2_N | U2, U1 | A-in | PCIe* serial data input: A serial differential input pair running at 2.5 Gb/s. An embedded clock present in this input is recovered along with the data.
PER_3_P, PER_3_N | AA2, AA1 | A-in | PCIe* serial data input: A serial differential input pair running at 2.5 Gb/s. An embedded clock present in this input is recovered along with the data.
PE_WAKE_N | AC20 | O/D | Wake: Pulled to '0' to indicate that a Power Management Event (PME) is pending and the PCI Express link should be restored. Defined in the PCI Express specifications.
PE_RST_N | AC9 | In | Power and clock good indication: Indicates that power and the PCI Express reference clock are within specified values. Defined in the PCI Express specifications. This pin is used as a fundamental reset indication for the device.
RSVDM3_NC, RSVDM2_NC | M3, M2 | A-out | Analog testing
PE_RCOMP | L1 | B | Impedance compensation. Connect to ground through an external 1.4 kOhm 1% 100 ppm resistor for impedance compensation. See Figure 11-13 for details.

2.1.2 Flash and EEPROM Ports (8)

The AC specification for these pins is described in Section 11.4.3.4 to Section 11.4.3.5.

Table 2-3. Flash and EEPROM Ports
Symbol | Ball # | Type | Name and Function
FLSH_SI | AC14 | T/S | Serial data output to the flash
FLSH_SO | AD14 | In | Serial data input from the flash
FLSH_SCK | AD15 | T/S | Flash serial clock. Operates at ~20 MHz.
FLSH_CE_N | AC15 | T/S | Flash chip select output
EE_DI | A21 | T/S | Data output to EEPROM
EE_DO | A20 | In | Data input from EEPROM
EE_SK | B20 | T/S | EEPROM serial clock. Operates at ~2 MHz.
EE_CS_N | B21 | T/S | EEPROM chip select output
2.1.3 System Management Bus (SMB) Interface

The AC specification for these pins is described in Section 11.4.3.3.

2.1.4 NC-SI Interface Pins

The AC specification for these pins is described in Section 11.4.3.6.

Table 2-4. NC-SI Interface Pins
Symbol | Ball # | Type | Name and Function
NCSI_CLK_IN | B5 | NC-SI-in | NC-SI reference clock input - Synchronous clock reference for receive, transmit and control interface. It is a 50 MHz clock +/- 50 ppm.
NCSI_CLK_OUT | B4 | NC-SI-out | NC-SI reference clock output - Synchronous clock reference for receive, transmit and control interface. It is a 50 MHz clock +/- 50 ppm. Serves as a clock source to the MC and the 82576 (when configured so).
NCSI_CRS_DV | A4 | NC-SI-out | CRS/DV - Carrier sense / receive data valid.
NCSI_RXD_1, NCSI_RXD_0 | A6, B7 | NC-SI-out | Receive data - Data signals from the 82576 to the BMC.
NCSI_TX_EN | B6 | NC-SI-in | Transmit enable.
NCSI_TXD_1, NCSI_TXD_0 | A7, B8 | NC-SI-in | Transmit data - Data signals from the MC to the 82576.
NCSI_ARB_OUT | B3 | NC-SI-out/NC-SI-in | NC-SI hardware arbitration token output pin.
NCSI_ARB_IN | AD3 | NC-SI-in | NC-SI hardware arbitration token input pin.
intel ? 82576eb gbe controller ? pin interface intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 64 2.1.5 miscellaneous pins the ac specification for the xtal pins is described in sections 11.4.6 . 2.1.6 serdes/sgmii pins the ac specification for these pins is described in section 11.4.4 . table 2-5. miscellaneous pins symbol ball # type name and function sdp0_0 sdp0_1 sdp0_2 sdp0_3 a16 b16 b17 b15 t/s sw defined pins for function 0: these pins are reserved pins that are software programmable w/rt input/output capability. these default to inputs upon power up, but may have their direction and output values defined in the eeprom. the sdp bits may be mapped to the general purpose interrupt bits when configured as inputs. the sdp0[0] pin can be used as a watchdog output indication. all the sdp pins can be used as sfp sideband signals (txdisable, present & txfault). the 82576 does not use these signals; it is available for sw control over sfp. sdp1_0 sdp1_1 sdp1_2 sdp1_3 ad10 a12 a13 ac10 t/s nc-si t/s t/s sw defined pins for function 1: reserved pins that are software programmable write/read input/output capability. these default to inputs upon power up, but may have their direction and output values defined in the eeprom. the sdp bits may be mapped to the general purpose interrupt bits when configured as inputs. the sdp1[0] pin can be used as a watchdog output indication. all the sdp pins can be used as sfp sideband signals (txdisable, present & txfault). the 82576 does not use these signals; it is available for sw control over sfp. main_pwr_ok ad4 in main power ok ? indicates that platform main power is up. must be connected externally to main core 3.3v power. dev_off_n b9 in device off: assertion of dev_off_n puts the device in device disable mode. this pin is asynchronous and is sampled once the eeprom is ready to be read following power-up. the dev_off_n pin should always be connected to vcc3p3 to enable device operation. 
xtal1 xtal2 n23 n24 a-in a-out reference clock / xtal: these pins may be driven by an external 25mhz crystal or driven by a single ended external cmos compliant 25mhz oscillator. table 2-6. serdes/sgmii pins symbol ball # type name and function srdsi_0_p srdsi_0_n j23 j24 a-in serdes/sgmii serial data input port 0: differential serdes receive interface. a serial differential input pair running at 1.25gb/s. an embedded clock present in this input is recovered along with the data. srdso_0_p srdso_0_n k23 k24 a-out serdes/sgmii serial data output port 0: differential serdes transmit interface. a serial differential output pair running at 1.25gb/s. this output carries both data and an embedded 1.25ghz clock that is recovered along with data at the receiving end. srds_0_sig_det a9 in port 0 signal detect: indicates that signal (light) is detected from the fiber. high for signal detect, low otherwise.
table 2-6. serdes/sgmii pins (continued)

symbol | ball # | type | name and function
srdsi_1_p, srdsi_1_n | t23, t24 | a-in | serdes/sgmii serial data input port 1: differential fiber serdes receive interface. a serial differential input pair running at 1.25 gb/s. an embedded clock present in this input is recovered along with the data.
srdso_1_p, srdso_1_n | r23, r24 | a-out | serdes/sgmii serial data output port 1: differential fiber serdes transmit interface. a serial differential output pair running at 1.25 gb/s. this output carries both data and an embedded 1.25 ghz clock that is recovered along with data at the receiving end.
srds_1_sig_det | a10 | in | port 1 signal detect: indicates that signal (light) is detected from the fiber. high for signal detect, low otherwise.
ser_rcomp | l22 | b | impedance compensation. connect to ground through an external 1.4 kohm 1% 100ppm resistor for impedance compensation. see figure 11-13 for details.

2.1.7 sfp pins

the ac specification for these pins is described in chapter 11.0.

2.1.8 media dependent interface (phy's mdi) pins

2.1.8.1 led's (8)

the table below describes the functionality of the led output pins. the default activity of each led may be modified in eeprom words 1ch and 1fh. the led functionality is reflected in, and can be further modified through, the ledctl configuration register.

table 2-7. led output pins

symbol | ball # | type | name and function
led0_0 | a19 | out | port 0 led0. programmable led which indicates by default link up.
led0_1 | b19 | out | port 0 led1. programmable led which indicates by default activity (when packets are transmitted or received that match mac filtering).
led0_2 | b18 | out | port 0 led2. programmable led which indicates by default a 100mbps link.
led0_3 | a18 | out | port 0 led3. programmable led which indicates by default a 1000mbps link.
led1_0 | ad13 | out | port 1 led0. programmable led which indicates by default link up.
led1_1 | ac11 | out | port 1 led1. programmable led which indicates by default activity (when packets are transmitted or received that match mac filtering).
led1_2 | ac13 | out | port 1 led2. programmable led which indicates by default a 100mbps link.
led1_3 | ac12 | out | port 1 led3. programmable led which indicates by default a 1000mbps link.
2.1.8.2 analog pins

the ac specification for these pins is described in chapter 11.0.

2.1.9 testability pins

table 2-8. testability pins

symbol | ball # | type | name and function
jtck | ac6 | in | jtag clock input
jtdi | ad7 | in | jtag tdi input
jtdo | ac8 | o/d | jtag tdo output
jtms | ac7 | in | jtag tms input
rsvdac5_3p3 | ac5 | in | jtag reset input (optional)
aux_pwr | b14 | t/s | auxiliary power available: when set, indicates that auxiliary power is available and the device should support d3cold power state if enabled to do so. this pin is also used for testing and scan.
lan1_dis_n | a15 | t/s | this pin is a strapping option pin latched at the rising edge of pe_rst# or in-band pcie* reset. this pin has an internal weak pull-up resistor. if this pin is not connected or is driven high during init time, lan 1 is enabled. if this pin is driven low during init time, the lan 1 function is disabled. this pin is also used for testing and scan.
lan0_dis_n | b13 | t/s | this pin is a strapping option pin latched at the rising edge of pe_rst# or in-band pcie* reset. this pin has an internal weak pull-up resistor. if this pin is not connected or is driven high during init time, lan 0 is enabled. if this pin is driven low during init time, the lan 0 function is disabled. this pin is also used for testing and scan.

2.1.10 reserved pins and no-connects

table 2-9. reserved pins and no-connects

symbol | ball # | name and function
rsvdab18_nc, rsvdab19_nc, rsvdac16_nc, rsvdac17_nc, rsvdad16_nc, rsvdad17_nc, rsvdm2_nc, rsvdm23_nc, rsvdm24_nc, rsvdm3_nc | ab18, ab19, ac16, ac17, ad16, ad17, m2, m23, m24, m3 | reserved, no-connect. these pins are reserved by intel and may have factory test functions. for normal operation, do not connect any circuitry to these pins. do not connect pull-up or pull-down resistors.
rsvda8_3p3, rsvda11_3p3, rsvdb10_3p3, rsvdb11_3p3, rsvdb12_3p3, rsvdad9_3p3 | a8, a11, b10, b11, b12, ad9 | reserved, vcc3p3. these pins are reserved by intel and may have factory test functions. for normal operation, connect them directly to vcc3p3. do not connect them to pull-up resistors.
rsvdac5_3p3 | ac5 | reserved, vcc3p3. this pin is reserved by intel and may have factory test functions. for normal operation, connect it to vcc3p3 with a 10 kohm pull-up resistor.
rsvdl14_1p0, rsvdp14_1p0 | l14, p14 | reserved, vcc1p0. these pins are reserved by intel and may have factory test functions. for normal operation, connect them directly to vcc1p0. do not connect them to pull-up resistors.
rsvdad8_vss, rsvda14_vss | ad8, a14 | reserved, vss. these pins are reserved by intel and may have factory test functions. for normal operation, connect them directly to vss. do not connect them to pull-down resistors.
ncac3 | ac3 | reserved, no connect. this pin is not connected internally.
intel ? 82576eb gbe controller ? pin interface intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 68 2.1.11 power supply pins 2.2 pull-ups/pull-downs the table below lists internal & external pull-up resistors and their functionality in different device states. table 2-10. power supply pins symbol ball # type name and function vcc3p3 ad6, ad12 3.3v 3.3v power input top vcc3p3 a5, a17 3.3v 3.3v power input bottom vcc1p0 r14, r13, r12, r11, p13, p12, l13, l12, k14, k13, k12, k11 1v 1v power digital vcc1p8 p9, p8,p5, p4, n9, n8, n5, n4, m9, m8, m5, m4, l9, l8, l5, l4 1.8v 1.8v analog power input pcie* vcc1p8 l15, k15, j15, h15, g15, e20, e19, d20, d19, aa20, aa19,y20, y19, v15, u15, t15, r15, p15, n21, n15, m21, m15 1.8v 1.8v analog power input phy vcc1p0 v5, v4, u5, u4, p11, n11, m11, l11, h5, h4, g5, g4 1.0v 1.0v analog power input pcie* vcc1p0 j21, j20, j18, j17, l21, l20, l18, l17, k21, k20, k18, k17, t21, t20, t18, t17, p21, p20, p18, p17, r21, r20, r18, r17 1.0v 1.0v analog power input phy vss y9, y8, y7, y6, y15, y14, y13, y12, y11, y10, w9, w8, w7, w14, w13, w12, w11, w10, v9, v8, v14, v13, v12, v11, v10, u9, u14, u13, u12, u11, u10, t14, t13, t12, t11, n14, n13, n12, m14, m13, m12, j14, j13, j12, j11, h9, h14, h13, h12, h11, h10, g9, g8, g14, g13, g12, g11, g10, f9, f8, f7, f14, f13, f12, f11, f10, e9, e8, e7, e6, e15, e14, e13, e12, e11, e10, d9, d8, d7, d6, d5, d16, d15, d14, d13, d12, d11, d10, c9, c8, c7, c6, c5, c4, c17, c16, c15, c14, c13, c12, c11, c10, b2, b1, ad5, ad2, ad11, ad1, ac4, ac2, ac1, ab9, ab8, ab7, ab6, ab5, ab4, ab17, ab16, ab15, ab14, ab13, ab12, ab11, ab10, aa9, aa8, aa7, aa6, aa5, aa16, aa15, aa14, aa13, aa12, aa11, aa10, a3, a2, a1 0 v digital ground vss y24, y23, y21, y18, y17, y16, w22, w21, w20, w19, w18, w17, w16, w15, v22, v21, v20, v19, v18, v17, v16, u24, u23, u22, u21, u20, u19, u18, u17, u16, t22, t19, t16, r22, r19, r16, p24, p23, p22, p19, p16, n22, n20, n19, n18, n17, n16, m22, m20, m19, m18, m17, m16, 
l24, l23, l19, l16, k22, k19, k16, j22, j19, j16, h24, h23, h22, h21, h20, h19, h18, h17, h16, g22, g21, g20, g19, g18, g17, g16, f22, f21, f20, f19, f18, f17, f16, f15, e24, e23, e21, e18, e17, e16, d22, d21, d18, d17, c22, c21, c20, c19, c18, b24, b23, ad24, ad23, ac24, ac23, ab22, ab21, ab20, aa22, aa21, aa18, aa17, a24, a23 0 v phy analog ground vss y5, y4, y3, y2, y1, w6, w5, w4, w3, v7, v6, v3, v2, v1, u8, u7, u6, u3, t9, t8, t7, t6, t5, t4, t3, t2, t10, t1, r9, r8, r7, r6, r5, r4, r3, r10, p7, p6, p3, p2, p10, p1, n7, n6, n3, n10, m7,m6, m10, m1, l7, l6, l3, l2, l10, k9, k8, k7, k6, k5, k4, k3, k10, j9, j8, j7, j6, j5, j4, j3, j2, j10, j1, h8, h7, h6, h3, g7, g6, g3, g2, g1, f6, f5, f4, f3, e5, e4, e3, e2, e1, d4, d3, c3, c2, c1, ab3, ab2, ab1, aa4, aa3 0v pcie* analog ground
each internal pup has a nominal value of 5 kΩ, ranging from 2.7 kΩ to 8.6 kΩ. the recommended values for external resistors are 400 Ω for pull-down resistors and 3 kΩ for pull-up resistors.

the device states are defined as follows:
• power-up = while 3.3v is stable, yet 1.0v isn't
• active = normal mode (not power up or disable)
• disable = device disable (a.k.a. dynamic iddq; see section 4.4)

table 2-11. pull-up resistors

columns, in order: signal name; power up (pup, comments); active (pup, comments); disable (pup, comments); external

pe_wake_n  n  n  n  y
pe_rst_n  y  n  n  n
flsh_si  y  n  y  n
flsh_so  y  y  y  n
flsh_sck  y  n  y  n
flsh_ce_n  y  n  y  n
ee_di  y  n  y  n
ee_do  y  y  y  n
ee_sk  y  n  y  n
ee_cs_n  y  n  y  n
smbd  n  n  n  y
smbclk  n  n  n  y
smbalrt_n  n  n  n  y
rsvdad17_nc  y  n  n  n
rsvdac17_nc  y  n  n  n
rsvdac16_nc  y  n  y  hiz  n
rsvdad16_nc  y  n  y  hiz  n
nc-si_clk_in  n  hiz  n  n  pd (note 1)
nc-si_clk_out  y  hiz  n  n  if active, stable output  n
nc-si_crs_dv  n  hiz  n  n  pd
nc-si_rxd[1:0]  y  hiz  n  n  y (note 2)
nc-si_tx_en  n  hiz  n  n  pd (note 1)
nc-si_txd[1:0]  n  hiz  n  n  pd (note 1)
nc-si_arb_in  n  y  controlled by eeprom  y  controlled by eeprom
nc-si_arb_out  y  y  y
sdp0[3:0]  y  y  until eeprom done  n  may keep state by eeprom control  n
sdp1[3:0]  y  y  until eeprom done  n  n
dev_off_n  y  n  n  must be connected on board
main_pwr_ok  y  n  n  must be connected on board
srds_0_sig_det  y  n  n  must be connected externally
srds_1_sig_det  y  n  n  must be connected externally
sfp0_i2c_clk  y  n  y  y  if active
sfp0_i2c_data  y  n  n  y
sfp1_i2c_clk  y  n  y  y  if active
sfp1_i2c_data  y  n  n  y
led0_0  y  n  n  hiz
led0_1  y  n  n  hiz
led0_2  y  n  n  hiz
led0_3  y  n  n  hiz
led1_0  y  n  n  hiz
led1_1  y  n  n  hiz
led1_2  y  n  n  hiz
led1_3  y  n  n  hiz
jtck  y  n  n  n
jtdi  y  n  n  y
jtdo  y  n  n  y
jtms  y  n  n  y
aux_pwr  y  n  n  pu or pd (note 3)
table 2-11. pull-up resistors (continued)

lan1_dis_n  y  y  when input  y  pu or pd (note 4)
lan0_dis_n  y  y  when input  y  pu or pd (note 4)

notes:
1. should be pulled down if nc-si interface is disabled.
2. only if nc-si is unused or set to multi drop configuration.
3. if aux power is connected, should be pulled up, else should be pulled down.
4. if the specific function is disabled, should be pulled down, else should be pulled up.

2.3 strapping

the following signals are used for static configuration. unless otherwise stated, strapping options are latched on the rising edge of internal_power_on_reset, at power up, at in-band pci express reset and at pe_rst_n assertion. at other times, they revert to their standard usage.

table 2-12. strapping options

purpose | pin | polarity | pull-up / pull-down
lan1 disable | lan1_dis_n | 0b - lan1 is disabled; 1b - lan1 is enabled | internal pull-up
lan0 disable | lan0_dis_n | 0b - lan0 is disabled; 1b - lan0 is enabled | internal pull-up
aux_pwr | aux_pwr | 0b - aux power is not available; 1b - aux power is available | none
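The strapping options of table 2-12 can be summarized in a few lines of code. The sketch below is illustrative only: the function name `decode_straps` and the use of `None` to model an unconnected pin are assumptions made for the example, not part of the datasheet. It relies only on the polarities listed in table 2-12 and on the internal pull-ups on lan0_dis_n and lan1_dis_n.

```python
from typing import Dict, Optional


def decode_straps(lan0_dis_n: Optional[int],
                  lan1_dis_n: Optional[int],
                  aux_pwr: int) -> Dict[str, bool]:
    """Interpret strap levels as sampled at reset.

    Levels are 1 (high) or 0 (low). None models an unconnected pin
    (hypothetical convention for this sketch).
    """
    def with_internal_pull_up(level: Optional[int]) -> int:
        # lan0_dis_n and lan1_dis_n have an internal pull-up, so an
        # unconnected pin is read as 1 (function enabled).
        return 1 if level is None else level

    return {
        "lan0_enabled": with_internal_pull_up(lan0_dis_n) == 1,
        "lan1_enabled": with_internal_pull_up(lan1_dis_n) == 1,
        # aux_pwr has no internal pull, so the board must drive it.
        "aux_power_available": aux_pwr == 1,
    }
```

For example, `decode_straps(None, 0, 1)` describes a board that leaves lan0_dis_n floating, ties lan1_dis_n low, and pulls aux_pwr up: lan 0 is enabled, lan 1 is disabled, and auxiliary power is reported as available.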
2.4 interface diagram

figure 2-1. 82576 interface diagram
pin interface ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 73 2.5 pin list (alphabetical) table 2-13 lists the pins and signals in pin alphabetical order. note that where multiple pins are listed, the list sorts by the lowest pin designator. vss pins are in table 2-14 . table 2-13. pin list (alphabetical by pin designation) signal pin signal pin signal pin nc- si_crs_dv a4 led0_2 b18 rsvdm2_nc m2 vcc3p3 a5, a17 led0_1 b19 rsvdm3_nc m3 nc-si_rxd_1 a6 ee_sk b20 rsvdm23_nc m23 nc-si_txd_1 a7 ee_cs_n b21 rsvdm24_nc m24 rsvda8_3p3 a8 ieee_atest0_ n b22 srds_0_sig_ det a9 pe_clk_n n1 srds_1_sig_ det a10 mdi0_n_0 c23 pe_clk_p n2 rsvda11_3p3 a11 mdi0_p_0 c24 xtal1 n23 sdp1_1 a12 pet_0_n d1 xtal2 n24 sdp1_2 a13 pet_0_p d2 vcc1p8 p9, p8,p5, p4, n9, n8, n5, n4, m9, m8, m5, m4, l9, l8, l5, l4 rsvda14_vss a14 mdi0_n_1 d23 rsvdp14_1p 0 p14 lan1_dis_n a15 mdi0_p_1 d24 sdp0_0 a16 rbias0 e22 pet_2_n r1 led0_3 a18 per_0_n f1 pet_2_p r2 led0_0 a19 per_0_p f2 vcc r14, r13, r12, r11, p13, p12, l13, l12, k14, k13, k12, k11 ee_do a20 mdi0_n_2 f23 srdso_1_p r23 ee_di a21 mdi0_p_2 f24 srdso_1_n r24 ieee_atest0_ p a22 mdi0_n_3 g23 srdsi_1_p t23 nc- si_arb_out b3 mdi0_p_3 g24 srdsi_1_n t24 nc- si_clk_out b4 pet_1_n h1 per_2_n u1 nc-si_clk_in b5 pet_1_p h2 per_2_p u2
intel ? 82576eb gbe controller ? pin interface intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 74 nc-si_tx_en b6 vcc1p0 j21, j20, j18, j17, l21, l20, l18, l17, k21, k20, k18, k17, t21, t20, t18, t17, p21, p20, p18, p17, r21, r20, r18, r17 mdi1_n_3 v23 nc-si_rxd_0 b7 srdsi_0_p j23 mdi1_p_3 v24 nc-si_txd_0 b8 srdsi_0_n j24 vcc1p0 v5, v4, u5, u4, p11, n11, m11, l11, h5, h4, g5, g4 dev_off_n b9 per_1_n k1 rsvdb10_3p3 b10 per_1_p k2 rsvdb11_3p3 b11 srdso_0_p k23 pet_3_n w1 rsvdb12_3p3 b12 srdso_0_n k24 pet_3_p w2 lan0_dis_n b13 pe_rcomp l1 mdi1_n_2 w23 aux_pwr b14 vcc1p8 p9, p8,p5, p4, n9, n8, n5, n4, m9, m8, m5, m4, l9, l8, l5, l4 mdi1_p_2 w24 sdp0_3 b15 rsvdl14_1p0 l14 rbias1 y22 sdp0_1 b16 vcc1p8 l15, k15, j15, h15, g15, e20, e19, d20, d19, aa20, aa19,y20, y19, v15,u15, t15, r15, p15, n21, n15, m21, m15 sdp0_2 b17 ser_rcomp l22 per_3_n aa1 sdp1_3 ac10 vcc3p3 ad6, ad12 per_3_p aa2 led1_1 ac11 jtdi ad7 mdi1_n_1 aa23 led1_3 ac12 rsvdad8_vs s ad8 mdi1_p_1 aa24 led1_2 ac13 rsvdad9_3p 3 ad9 rsvdab18_nc ab18 flsh_si ac14 sdp1_0 ad10 rsvdab19_nc ab19 flsh_ce_n ac15 led1_0 ad13 mdi1_n_0 ab23 sfp1_i2c_dat a/mdio1 ac18 flsh_so ad14 mdi1_p_0 ab24 sfp1_i2c_clk /mdc1 ac19 flsh_sck ad15 ncac3 ac3 pe_wake_n ac20 sfp0_i2c_da ta/mdio0 ad18 rsvdac5_3p3 ac5 smbclk ac21 sfp0_i2c_cl k/mdc0 ad19 table 2-13. pin list (alphabetical by pin designation) (continued) signal pin signal pin signal pin
pin interface ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 75 2.6 ball out this section provides a top view ball map of the 82576 in a 25 mmx25 mm package. some names in the layout are not accurate (short names were chosen to fit). see figure 2-2 for the color key for the ball out table. jtck ac6 ieee_atest1_ n ac22 smbalrt_n ad20 jtms ac7 smbd ad21 jtdo ac8 nc-si_arb_in ad3 ieee_atest1 _p ad22 pe_rst_n ac9 main_pwr_o k ad4 table 2-14. vss pins signal pin vss y24, y23, y21, y18, y17, y16, w22, w21, w20, w19, w18, w17, w16, w15, v22, v21, v20, v19, v18, v17, v16, u24, u23, u22, u21, u20, u19, u18, u17, u16, t22, t19, t16, r22, r19, r16, p24, p23, p22, p19, p16, n22, n20, n19, n18, n17, n16, m22, m20, m19, m18, m17, m16, l24, l23, l19, l16, k22, k19, k16, j22, j19, j16, h24, h23, h22, h21, h20, h19, h18, h17, h16, g22, g21, g20, g19, g18, g17, g16, f22, f21, f20, f19, f18, f17, f16, f15, e24, e23, e21, e18, e17, e16, d22, d21, d18, d17, c22, c21, c20, c19, c18, b24, b23, ad24, ad23, ac24, ac23, ab22, ab21, ab20, aa22, aa21, aa18, aa17, a24, a23, y5, y4, y3, y2, y1, w6, w5, w4, w3, v7, v6, v3, v2, v1, u8, u7, u6, u3, t9, t8, t7, t6, t5, t4, t3, t2, t10, t1, r9, r8, r7, r6, r5, r4, r3, r10, p7, p6, p3, p2, p10, p1, n7, n6, n3, n10, m7,m6, m10, m1, l7, l6, l3, l2, l10, k9, k8, k7, k6, k5, k4, k3, k10, j9, j8, j7, j6, j5, j4, j3, j2, j10, j1, h8, h7, h6, h3, g7, g6, g3, g2, g1, f6, f5, f4, f3, e5, e4, e3, e2, e1, d4, d3, c3, c2, c1, ab3, ab2, ab1, aa4, aa3 figure 2-2. color key for ball-out table 2-13. pin list (alphabetical by pin designation) (continued) signal pin signal pin signal pin clock/bias/ieee test pins mdi interface nc-si signals vcc1p8 vcc1p0 vss functional pin pcie signals vcc3p3 open drain reserved signals mdio/2 wire interface signals
intel ? 82576eb gbe controller ? pin interface intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 76 figure 2-3. ball-out representation 242322212019181716151413121110987654321 ad vss vss ieee_ate st1_p smbd smbalrt _n sfp0_i2c_ clk/mdc0 sfp0_i2c_ data/mdio 0 rsvdad1 7_nc rsvdad1 6_nc flsh_sck flsh_so led1_0 vcc3p3 vss sdp1_0 rsvdad9 _3p3 rsvdad8 _vss jtdi vcc3p3 vss main_pw r_ok ncsi_arb _in vss vss ac vss vss ieee_ate st1_n smbclk pe_wake _n sfp1_i2c_ clk/mdc1 sfp1_i2c_ data/mdio 1 rsvdac1 7_nc rsvdac1 6_nc flsh_ce_ n flsh_si led1_2 led1_3 led1_1 sdp1_3 pe_rst_n jtdo jtms jtck rsvdac5 _3p3 vss ncac3 vss vss ab mdi1_p_0 mdi1_n_0 vss vss vss rsvdab19 _nc rsvdab18 _nc vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss aa mdi1_p_1 mdi1_n_1 vss vss vcc1p8 vcc1p8 vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss per_3_p per_3_n y vss vss rbias1 vss vcc1p8 vcc1p8 vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss w mdi1_p_2 mdi1_n_2 vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss pet_3_p pet_3_n v mdi1_p_3 mdi1_n_3 vss vss vss vss vss vss vss vcc1p8 vss vss vss vss vss vss vss vss vss vcc1p0_p e vcc1p0_p e vss vss vss u vss vss vss vss vss vss vss vss vss vcc1p8 vss vss vss vss vss vss vss vss vss vcc1p0_p e vcc1p0_p e vss per_2_p per_2_n t srdsi_1_ n srdsi_1_ p vss vcc1p0 vcc1p0 vss vcc1p0 vcc1p0 vss vcc1p8 vss vss vss vss vss vss vss vss vss vss vss vss vss vss r srdso_1 _n srdso_1 _p vss vcc1p0 vcc1p0 vss vcc1p0 vcc1p0 vss vcc1p8 vcc1p0 vcc1p0 vcc1p0 vcc1p0 vss vss vss vss vss vss vss vss pet_2_p pet_2_n p vss vss vss vcc1p0 vcc1p0 vss vcc1p0 vcc1p0 vss vcc1p8 rsvdp14_ nc vcc1p0 vcc1p0 vcc1p0 vss vcc1p8 vcc1p8 vss vss vcc1p8 vcc1p8 vss vss vss n xtal2 xtal1 vss vcc1p8 vss vss vss vss vss vcc1p8 vss vss vss vcc1p0 vss vcc1p8 vcc1p8 vss vss vcc1p8 vcc1p8 vss pe_clk_p pe_clk_n m rsvdm24 _nc rsvdm23 _nc vss vcc1p8 vss vss vss vss vss vcc1p8 vss vss vss vcc1p0 vss vcc1p8 
vcc1p8 vss vss vcc1p8 vcc1p8 rsvdm3_ nc rsvdm2_ nc vss l vss vss ser_rco mp vcc1p0 vcc1p0 vss vcc1p0 vcc1p0 vss vcc1p8 rsvdl14_ 1p0 vcc1p0 vcc1p0 vcc1p0 vss vcc1p8 vcc1p8 vss vss vcc1p8 vcc1p8 vss vss pe_rcom p k srdso_0 _n srdso_0 _p vss vcc1p0 vcc1p0 vss vcc1p0 vcc1p0 vss vcc1p8 vcc1p0 vcc1p0 vcc1p0 vcc1p0 vss vss vss vss vss vss vss vss per_1_p per_1_n j srdsi_0_ n srdsi_0_ p vss vcc1p0 vcc1p0 vss vcc1p0 vcc1p0 vss vcc1p8 vss vss vss vss vss vss vss vss vss vss vss vss vss vss h vss vss vss vss vss vss vss vss vss vcc1p8 vss vss vss vss vss vss vss vss vss vcc1p0 vcc1p0 vss pet_1_p pet_1_n g mdi0_p_3 mdi0_n_3 vss vss vss vss vss vss vss vcc1p8 vss vss vss vss vss vss vss vss vss vcc1p0 vcc1p0 vss vss vss f mdi0_p_2 mdi0_n_2 vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss per_0_p per_0_n e vss vss rbias0 vss vcc1p8 vcc1p8 vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss d mdi0_p_1 mdi0_n_1 vss vss vcc1p8 vcc1p8 vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss pet_0_p pet_0_n c mdi0_p_0 mdi0_n_0 vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss vss b vss vss ieee_ate st0_n ee_cs_n ee_sk led0_1 led0_2 sdp0[2] sdp0[1] sdp0[3] aux_pwr lan0_dis_n rsvdb12_n c rsvdb11_n c rsvdb10_ nc dev_ off_n ncsi_ txd[0] ncsi_rxd [ ncsi_tx_ e ncsi_clk _ ncsi_ clk _ ncsi_arb _out vss vss a vss vss ieee_ate st0_p ee_di ee_do led0_0 led0_3 vcc3p3 sdp0[0] lan1_dis _n rsvda14_ nc sdp1[2] sdp1[1] rsvda11_ nc srds1_ sig_ det srds0_ sig_det rsvda8_n c ncsi_txd [ ncsi_rxd [ vcc3p3 ncsi_crs _ vss vss vss
3.0 interconnects

3.1 pcie

3.1.1 pcie overview

pcie is a third generation i/o architecture that enables cost competitive next generation i/o solutions providing industry leading price/performance and feature richness. it is an industry-driven specification.

pcie defines a basic set of requirements that encompasses the majority of the targeted application classes. the requirements of higher-end applications, such as enterprise class servers and high-end communication platforms, are covered by a set of advanced extensions that complement the baseline requirements.

to guarantee headroom for future applications of pcie, a software-managed mechanism for introducing new, enhanced capabilities in the platform is provided.

figure 3-1 shows pcie architecture.

figure 3-1. pcie stack structure

pcie's physical layer consists of a differential transmit pair and a differential receive pair. full-duplex data on these two point-to-point connections is self-clocked such that no dedicated clock signals are required. the bandwidth of this interface increases linearly with frequency.
the packet is the fundamental unit of information exchange and the protocol includes a message space to replace the various side-band signals found on many buses today. this movement of hard-wired signals from the physical layer to messages within the transaction layer enables easy and linear physical layer width expansion for increased bandwidth.

the common base protocol uses split transactions and several mechanisms are included to eliminate wait states and to optimize the reordering of transactions to further improve system performance.

3.1.1.1 architecture, transaction and link layer properties

• split transaction, packet-based protocol
• common flat address space for load/store access (such as pci addressing model)
  - memory address space of 32-bit to allow compact packet header (must be used to access addresses below 4 gb)
  - memory address space of 64-bit using extended packet header
• transaction layer mechanisms:
  - pci-x style relaxed ordering
  - optimizations for no-snoop transactions
• credit-based flow control
• packet sizes/formats:
  - maximum packet size supports 128 byte and 256 byte data payload
  - maximum read request size of 512 bytes
• reset/initialization:
  - frequency/width/profile negotiation performed by hardware
• data integrity support
  - using crc-32 for transaction layer packets
• link layer retry for recovery following error detection
  - using crc-16 for link layer messages
• no retry following error detection
  - 8b/10b encoding with running disparity
• software configuration mechanism:
  - uses pci configuration and bus enumeration model
  - pcie-specific configuration registers mapped via pci extended capability mechanism
• baseline messaging:
  - in-band messaging of formerly side-band legacy signals (such as interrupts)
  - system-level power management supported via messages
• power management:
  - full support for pci-pm
  - wake capability from d3cold state
  - compliant with acpi, pci-pm software model
  - active state power management
• support for pcie v2.0 (2.5gt/s)
• support for completion time out
• support for additional registers in the pcie capability structure.

3.1.1.2 physical interface properties

• point to point interconnect
  - full-duplex; no arbitration
• signaling technology:
  - low voltage differential (lvd)
  - embedded clock signaling using 8b/10b encoding scheme
• serial frequency of operation: 2.5 ghz.
• interface width of x4, x2, or x1.
• dft and dfm support for high volume manufacturing

3.1.1.3 advanced extensions

pcie defines a set of optional features to enhance platform capabilities for specific usage modes. the 82576 supports the following optional features:

• advanced error reporting - messaging support to communicate multiple types/severity of errors.
• device serial number - allows exposure of a unique serial number for each device.
• alternative requester id (ari) - allows support for more than 8 functions per device.
• single root i/o virtualization (pci-sig sr-iov) - allows exposure of virtual functions controlling a subset of the resources to virtual machines.

3.1.2 functionality - general

3.1.2.1 native/legacy

• all the 82576 pci functions are native pcie functions.

3.1.2.2 locked transactions

• the 82576 does not support locked requests as target or master.

3.1.2.3 end to end crc (ecrc)

• not supported by the 82576.
3.1.3 host i/f

3.1.3.1 tag ids

pcie device numbers identify logical devices within the physical device (the 82576 is a physical device). the 82576 implements a single logical device with up to two separate pci functions: lan 0 and lan 1. the device number is captured from each type 0 configuration write transaction.

each of the pcie functions interfaces with the pcie unit through one or more clients. a client id identifies the client and is included in the tag field of the pcie packet header. completions always carry the tag value included in the request to enable routing of the completion to the appropriate client.

tag ids are allocated differently for reads and writes. messages are sent with a tag of 0x1f.

3.1.3.1.1 tag id allocation for read transactions

table 3-1 lists the tag id allocation for read accesses. the tag id is interpreted by hardware in order to forward the read data to the required device.

table 3-1. ids in read transactions

tag id | description | comment
0 | reserved |
1 | descriptor rx | like 82571/82572/82575
2 | reserved |
3 | reserved |
4 | descriptor tx | like 82571/82572/82575
5 | reserved |
6 | reserved |
7 | reserved |
8 | data request 0 | like 82571/82572/82575
9 | data request 1 | like 82571/82572/82575
0a | data request 2 | like 82571/82572/82575
0b | data request 3 | like 82571/82572/82575
10 | reserved |
11 | message unit |
12-1f | reserved |

3.1.3.1.2 tag id allocation for write transactions

request tag allocation depends on these system parameters:

• dca supported/not supported in the system (dca_ctrl.dca_dis - see section 8.13.4 for details)
• dca enabled/disabled for each type of traffic (txctl.tx descriptor dca en, rxctl.rx descriptor dca en, rxctl.rx header dca en, rxctl.rx payload dca en)
• system type: legacy dca vs. dca 1.0 (dca_ctrl.dca_mode - see section 8.13.4 for details).
• cpu id (rxctl.cpuid or txctl.cpuid)

since dca is implemented differently in i/oat 1 and in i/oat 2/3 platforms, the tag ids are different as well (see section 3.1.3.1.2.3 below).

3.1.3.1.2.1 case 1 - dca disabled in the system:

table 3-2 describes the write request tags. unlike read tags, these values are for debug only; they allow requests to be traced through the system.

table 3-2. ids in write transactions, dca disabled mode

tag id | description
0x0 - 0x1 | reserved
0x2 | tx descriptors write-back / tx head write-back
0x3 | reserved
0x4 | rx descriptors write-back
0x5 | reserved
0x6 | write data
0x7 - 0x1d | reserved
0x1e | msi and msi-x
0x1f | reserved

3.1.3.1.2.2 case 2 - dca enabled in the system, but disabled for the request:

• legacy dca platforms - if dca is disabled for the request, the tag allocation is identical to the case where dca is disabled in the system. see table 3-2 above.
• dca 1.0 platforms - all write requests have the tag of 0x00.

note: when in dca 1.0 mode, messages and msi/msi-x write requests are sent with the no-hint tag.

3.1.3.1.2.3 case 3 - dca enabled in the system, dca enabled for the request:

• legacy dca platforms: the request tag is constructed as follows:
  - bit[0] - dca enable
  - bits[3:1] - the cpu id field taken from the cpuid[2:0] bits of the rxctl or txctl registers
  - bits[7:4] - reserved
• dca 1.0 platforms: the request tag (all 8 bits) is taken from the cpuid field of the rxctl or txctl registers

3.1.3.2 completion timeout mechanism

in any split transaction protocol, there is a risk associated with the failure of a requester to receive an expected completion. to enable requesters to attempt recovery from this situation in a standard manner, the completion timeout mechanism is defined.
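As an illustration of the write-request tag rules in section 3.1.3.1.2 (cases 1 to 3 and table 3-2), the following sketch computes the tag for one request. It is a minimal model, not driver code: the function name `write_request_tag`, its parameters, and the constant names are hypothetical, and it assumes that bit 0 ("dca enable") is set to 1 whenever dca is enabled for the request on a legacy dca platform and that reserved bits are 0.

```python
# Write-request tags from table 3-2 (used when DCA does not override the tag).
TAG_TX_DESC_WRITE_BACK = 0x02
TAG_RX_DESC_WRITE_BACK = 0x04
TAG_WRITE_DATA = 0x06
TAG_MSI = 0x1E


def write_request_tag(dca_in_system: bool,
                      dca_for_request: bool,
                      dca_1_0: bool,
                      cpuid: int,
                      table_tag: int) -> int:
    """Return the 8-bit tag of a write request.

    dca_1_0   -- True for a DCA 1.0 platform, False for legacy DCA
    cpuid     -- value of the cpuid field of rxctl or txctl
    table_tag -- the table 3-2 tag for this kind of request
    """
    if not dca_in_system:
        # Case 1: DCA disabled in the system, debug tags of table 3-2.
        return table_tag
    if not dca_for_request:
        # Case 2: legacy DCA behaves as case 1, DCA 1.0 always sends 0x00.
        return 0x00 if dca_1_0 else table_tag
    if dca_1_0:
        # Case 3, DCA 1.0: all 8 tag bits come from the cpuid field.
        return cpuid & 0xFF
    # Case 3, legacy DCA: bit 0 = dca enable (assumed 1 here),
    # bits 3:1 = cpuid[2:0], bits 7:4 reserved (assumed 0 here).
    return ((cpuid & 0x7) << 1) | 0x1
```

For instance, an rx descriptor write-back with dca enabled for the request and cpuid 5 yields 0x0B on a legacy dca platform and 0x05 on a dca 1.0 platform, while the same request with dca disabled in the system keeps the table 3-2 tag 0x04.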
the completion timeout mechanism is activated for each request that requires one or more completions when the request is transmitted. the pcie specification, rev. 1.1 requires that the completion timeout timer:

• should not expire in less than 10 ms.
• must expire if a request is not completed within 50 ms.

however, some platforms experience completion latencies that are longer than 50 ms, in some cases up to seconds. in pcie specification, rev 2.0, a mechanism to allow configuration of the completion timeout was added. the 82576 supports both the legacy rev. 1.1 and the default rev 2.0 mechanisms. to support the legacy mode, it provides a programmable range for the completion timeout, as well as the ability to disable completion timeout altogether. the default pcie rev 2.0 mode programs completion timeout through an extension of the pcie capability structure. the new capability structure is assigned a pcie capability structure version of 0x2.

the 82576 controls the following aspects of completion timeout:

• disabling or enabling completion timeout
• disabling or enabling resending a request on completion timeout
• a programmable range of timeout values

the behavior of the completion timeout is programmed differently depending on whether capability structure version 0x1 or capability structure version 0x2 is enabled. table 3-3 lists the behavior for both cases. the capability structure exposed and the mode used are fixed by the gio_cap field in the pcie init configuration 3 eeprom word (word 0x1a).

table 3-3. completion timeout programming

capability | capability structure version = 0x1 | capability structure version = 0x2
completion timeout enabling | loaded from eeprom into completion_timeout_disable bit in the pcie control register (gcr, 0x05000). | controlled through pcie configuration space device control 2 register (0xc8) bit 4. visible through read-only csr
resend request enable | loaded from eeprom into completion_timeout_resend bit in the pcie control register (gcr, 0x05000). | same as version = 0x1
completion timeout period | loaded from eeprom into csr bit. | controlled through pcie configuration space device control 2 register (0xc8) bits 3:0. visible through read-only csr bit.

3.1.3.2.1 completion timeout enable

• version = 0x1 - loaded from the completion timeout disable bit in the eeprom (word 0x15, bit 7) into the completion_timeout_disable bit in the pcie control register (gcr). completion timeout enabled is the default.
• version = 0x2 - programmed through pcie configuration space device control 2 register (0xc8) bit 4. visible through the completion_timeout_disable bit in the pcie control register (gcr). completion timeout enabled is the default.

3.1.3.2.2 resend request enable
interconnects ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 83 ? version = 0x1- the completion timeout resend eeprom bit (word 0x15, bit 4) , loaded to the completion_timeout_resend bit in the pcie control register (gcr), enables resending the request (applies only when completion timeout is enabled). the default is to resend a request that timed out. ? version = 0x2 - same as when version = 0x1. 3.1.3.2.3 completion timeout period ? version = 0x1.- loaded from the completion timeout value field in the eeprom (word 0x15, bits 6:5) to the completion_timeout_value bits in the pcie control register (gcr). the following values are supported. ? version = 0x2 - programmed through pci configuration. visible through the completion_timeout_value bits in the pcie control register (gcr). the 82576 supports all four ranges defined by the pcie ecr. ? 50 us to 10 ms ? 10 ms to 250 ms ? 250 ms to 4 s ? 4 s to 64 s system software programs a range (one of nine possible ranges that sub-divide the four ranges previously mentioned) into the pcie configuration space device control 2 register (0xc8) bits 3:0. the following are supported sub-ranges. setting: completion timeout value pcie spec defined ranges ranges implemented 00 (default) 50 s to 10 ms 500 s ? 1 ms 01 10 ms to 250 ms 50 ms ? 100 ms 10 250 ms to 4 s 500 ms ? 1s 11 4 s to 64 s 10s ? 20s setting: completion timeout value device control 2 register (0xc8) bits 3:0 pcie defined ranges ranges implemented 0000 (default) 50 s- 10 ms 500 s ? 1ms 0001 50 us ? 100 s 50 s ? 100 us 0010 1 ms- 10 ms 2 ms ? 4 ms 0101 16 ms ? 55 ms 16 ms ? 32 ms 0110 65 ms ? 210 ms 65 ms ? 130 ms 1001 260 ms ? 900 ms 260 ms ? 520 ms 1010 1 s ? 3.5 s 1 s ? 2 s 1101 4 s ? 13 s 4 s ? 8 s 1110 17 s ? 64 s 17 s ? 34 s
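as an illustration of the version = 0x2 programming model, the following python sketch decodes a device control 2 value into the implemented timeout range and picks a setting whose lower bound lies above a given latency. the range table is transcribed from the sub-range table above; the function names are hypothetical, and choosing by lower bound is a conservative assumption rather than a documented rule.

```python
# Implemented timeout ranges in seconds, keyed by device control 2 bits 3:0.
IMPLEMENTED_RANGES = {
    0b0000: (500e-6, 1e-3),   # default
    0b0001: (50e-6, 100e-6),
    0b0010: (2e-3, 4e-3),
    0b0101: (16e-3, 32e-3),
    0b0110: (65e-3, 130e-3),
    0b1001: (260e-3, 520e-3),
    0b1010: (1.0, 2.0),
    0b1101: (4.0, 8.0),
    0b1110: (17.0, 34.0),
}


def decode_devctl2(devctl2):
    """Return (timeout_disabled, implemented_range_or_None).

    Bit 4 is the completion timeout disable bit; bits 3:0 select the range.
    Encodings not listed in the table decode to None.
    """
    disabled = bool(devctl2 & (1 << 4))
    return disabled, IMPLEMENTED_RANGES.get(devctl2 & 0xF)


def setting_above(max_latency_s):
    """Pick the shortest setting whose lower bound exceeds max_latency_s."""
    candidates = [(low, code) for code, (low, _high) in IMPLEMENTED_RANGES.items()
                  if low > max_latency_s]
    return min(candidates)[1] if candidates else None
```

for a platform whose worst-case completion latency is about 100 ms, `setting_above(0.1)` selects encoding 1001 (260 ms to 520 ms).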
a memory read request for which there are multiple completions is considered completed only when all completions are received by the requester. if some, but not all, requested data is returned before the completion timeout timer expires, the requester is permitted to keep or to discard the data that was returned prior to timer expiration.

note: the completion timeout value must be programmed correctly in pcie configuration space (in the device control 2 register); the value must be set above the expected maximum latency for completions in the system in which the device is installed. this ensures that the device receives the completions for the requests it sends out, avoiding a completion timeout scenario. it is expected that the system bios sets this value appropriately for the system.

3.1.4 transaction layer

the upper layer of the pcie architecture is the transaction layer. the transaction layer connects to the 82576 core using an implementation-specific protocol. through this core-to-transaction-layer protocol, the application-specific parts of the 82576 interact with the pcie subsystem and transmit and receive requests to or from the remote pcie agent, respectively.

3.1.4.1 transaction types accepted by the 82576

flow control types:

• ph - posted request headers
• pd - posted request data payload
• nph - non-posted request headers
• npd - non-posted request data payload
• cplh - completion headers
• cpld - completion data payload

table 3-4. transaction types at the rx transaction layer

transaction type | fc type | tx later reaction | hardware should keep data from original packet | for client
configuration read request | nph | cplh + cpld | requester id, tag, attribute | configuration space
configuration write request | nph + npd | cplh | requester id, tag, attribute | configuration space
memory read request | nph | cplh + cpld | requester id, tag, attribute | csr
memory write request | ph + pd | - | - | csr
io read request | nph | cplh + cpld | requester id, tag, attribute | csr
io write request | nph + npd | cplh | requester id, tag, attribute | csr
read completions | cplh + cpld | - | - | dma
message | ph | - | - | message unit / int / pm / error unit
3.1.4.1.1 configuration request retry status

pcie supports devices that require a lengthy self-initialization sequence to complete before they are able to service configuration requests. this is the case for the 82576, which might have a delay in initialization due to an eeprom read.

if the read of the pcie section in the eeprom has not completed and the 82576 receives a configuration request, the 82576 responds with a configuration request retry completion status to terminate the request. this effectively stalls the configuration request until the subsystem has completed local initialization and is ready to communicate with the host.

3.1.4.1.2 partial memory read and write requests

the 82576 has limited support for read and write requests in which only some of the byte enable bits are set, as described in this section.

partial writes to the msi-x table are supported. all other partial writes are ignored and a completion abort is sent.

zero-length writes have no internal impact (nothing is written and there are no side effects such as clear-by-write). the transaction is treated as a successful operation (no error event).

partial reads with at least one byte enabled are answered as a full read. any side effect of the full read (such as clear-by-read) also applies to partial reads.

zero-length reads generate a completion, but the register is not accessed and undefined data is returned.

3.1.4.2 transaction types initiated by the 82576

table 3-5. transaction types at the tx transaction layer

transaction type | payload size | fc type | from client
configuration read request completion | dword | cplh + cpld | configuration space
configuration write request completion | - | cplh | configuration space
i/o read request completion | dword | cplh + cpld | csr
i/o write request completion | - | cplh | csr
read request completion | dword/qword | cplh + cpld | csr
memory read request | - | nph | dma
memory write request | <= max_payload_size | ph + pd | dma
message | - | ph | message unit / int / pm / error unit

note: the supported max_payload_size is loaded from eeprom (128 bytes, 256 bytes or 512 bytes). if the ari capability is not exposed, the effective max_payload_size is defined for each pci function according to the configuration space register of that function. if the ari capability is exposed, the effective max_payload_size is defined for all pci functions according to the configuration space register of function zero.

3.1.4.2.1 data alignment
requests must never specify an address/length combination that causes a memory space access to cross a 4 kb boundary. the 82576 breaks requests into 4 kb-aligned requests (if needed). this does not pose any requirement on software. however, if software allocates a buffer across a 4 kb boundary, hardware issues multiple requests for the buffer. software should consider limiting buffer sizes and base addresses to comply with a 4 kb boundary in cases where it improves performance.

the general rules for packet alignment are as follows:

1. the length of a single request should not exceed the pcie limit of max_payload_size for write and max_read_req for read.
2. the length of a single request does not exceed the 82576's internal limitation of 512 bytes.
3. a single request should not span across different memory pages as noted by the 4 kb boundary previously mentioned.

note: the rules apply to all the 82576 requests (read/write, snoop and no snoop).

if a request can be sent as a single pcie packet and still meet rules 1-3, then it is not broken at a cache-line boundary (as defined in the pcie cache line size configuration word), but rather sent as a single packet. the motivation is that the chipset might break the request along cache-line boundaries, but the 82576 should still benefit from better pcie utilization. however, if rules 1-3 require that the request is broken into two or more packets, then the request is broken at a cache-line boundary.

3.1.4.2.2 multiple tx data read requests

the 82576 supports four pipelined requests for transmit data. in general, the four requests might belong to the same packet or to consecutive packets. however, the following restriction applies:

• all requests for a packet are issued before a request is issued for a consecutive packet.

read requests can be issued from any of the supported queues, as long as the restriction is met. pipelined requests might belong to the same queue or to separate queues. however, as previously noted, all requests for a certain packet are issued (from the same queue) before a request is issued for a different packet (potentially from a different queue).

the pcie specification does not ensure that completions for separate requests return in order. read completions for concurrent requests are not required to return in the order issued. the 82576 handles completions that arrive in any order. once all completions arrive for a given request, the 82576 might issue the next pending read data request.

• the 82576 incorporates a 2 kb re-order buffer to support re-ordering of completions for four requests. each request/completion can be up to 512 bytes long. the maximum size of a read request is defined as the minimum {512, max_read_request_size}.

in addition to the four pipelined requests for transmit data, the 82576 can issue a single read request for each of the tx descriptors and rx descriptors. the requests for tx data, tx descriptor, and rx descriptor are independently issued. each descriptor read request can fetch up to 16 descriptors (equal to 256 bytes of data).

3.1.4.3 messages

3.1.4.3.1 message handling by the 82576 (as a receiver)

message packets are special packets that carry a message code. the upstream device transmits special messages to the 82576 by using this mechanism.
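returning to the alignment rules of section 3.1.4.2.1 and the read-size limit of section 3.1.4.2.2, the following python sketch shows one way a read of arbitrary address and length could be broken into compliant requests. it is only a reading of the rules, not the hardware algorithm: the function name is hypothetical, the 64-byte cache line is an assumed value of the cache line size configuration word, and the exact break points chosen by the device are not stated in the text.

```python
PAGE = 4096            # requests must not cross a 4 kb boundary
INTERNAL_LIMIT = 512   # internal limitation of the 82576, in bytes


def split_read(addr, length, max_read_req=512, cache_line=64):
    """Split a read into (address, size) requests that obey rules 1-3."""
    limit = min(INTERNAL_LIMIT, max_read_req)
    requests = []
    while length > 0:
        room_in_page = PAGE - (addr % PAGE)
        size = min(length, limit, room_in_page)
        if size < length:
            # A split is unavoidable, so end this request on a cache-line
            # boundary when one lies inside the candidate request.
            aligned_end = ((addr + size) // cache_line) * cache_line
            if aligned_end > addr:
                size = aligned_end - addr
        requests.append((addr, size))
        addr += size
        length -= size
    return requests
```

a 256-byte buffer starting at 0xf80 crosses a page and becomes two 128-byte requests, while the same buffer at 0x1000 is sent as a single request even though it spans several cache lines.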
the transaction layer decodes the message code and responds to the message accordingly.

table 3-6. supported message in the 82576 (as a receiver)

message code [7:0] | routing r2r1r0 | message | device later response
0x14 | 100 | pm_active_state_nak | internal signal set
0x19 | 011 | pme_turn_off | internal signal set
0x50 | 100 | slot power limit support (has one dword data) | silently drop
0x7e | 010,011,100 | vendor_defined type 0 no data | unsupported request (1)
0x7e | 010,011,100 | vendor_defined type 0 data | unsupported request (1)
0x7f | 010,011,100 | vendor_defined type 1 no data | silently drop
0x7f | 010,011,100 | vendor_defined type 1 data | silently drop
0x00 | 011 | unlock | silently drop

(1) no completion is expected for this type of packets.

3.1.4.3.2 message handling by the 82576 (as a transmitter)

the transaction layer is also responsible for transmitting specific messages to report internal/external events (such as interrupts and pmes).

table 3-7. supported message in the 82576 (as a transmitter)

message code [7:0] | routing r2r1r0 | message
0x20 | 100 | assert int a
0x21 | 100 | assert int b
0x22 | 100 | assert int c
0x23 | 100 | assert int d
0x24 | 100 | de-assert int a
0x25 | 100 | de-assert int b
0x26 | 100 | de-assert int c
0x27 | 100 | de-assert int d
0x30 | 000 | err_cor
0x31 | 000 | err_nonfatal
0x33 | 000 | err_fatal
0x18 | 000 | pm_pme
0x1b | 101 | pme_to_ack

3.1.4.4 ordering rules

the 82576 meets the pcie ordering rules (pci-x rules) by following the pci simple device model:
• deadlock avoidance - master and target accesses are independent. the response to a target access does not depend on the status of a master request to the bus. if master requests are blocked, such as due to no credits, target completions might still proceed (if credits are available).
• descriptor/data ordering - the 82576 does not proceed with some internal actions until respective data writes have ended on the pcie link:
  • the 82576 does not update an internal header pointer until the descriptors that the header pointer relates to are written to the pcie link.
  • the 82576 does not issue a descriptor write until the data that the descriptor relates to is written to the pcie link.

the 82576 might issue the following master read requests from each of the following clients:

• rx descriptor read (one for each lan port)
• tx descriptor read (two for each lan port)
• tx data read (up to four for each lan port / one for the manageability)

completions for separate read requests are not guaranteed to return in order. completions for a single read request are guaranteed to return in address order.

3.1.4.4.1 out of order completion handling

in a split transaction protocol, when using multiple read requests in a multi-processor environment, there is a risk that completions arrive from the host memory out of order and interleaved. in this case, the 82576 sorts the request completions and transfers them to the ethernet in the correct order.

3.1.4.5 transaction definition and attributes

3.1.4.5.1 max payload size

the 82576 policy to determine max payload size (mps) is as follows:

• master requests initiated by the 82576 (including completions) limit mps to the value defined for the function issuing the request.
• target write accesses to the 82576 are accepted only with a size of one dword or two dwords. write accesses in the range from three dwords to mps are flagged as ur. write accesses above mps are flagged as malformed.

3.1.4.5.2 traffic class (tc) and virtual channels (vc)

the 82576 only supports tc=0 and vc=0 (default).

3.1.4.5.3 relaxed ordering

the 82576 takes advantage of the relaxed ordering rules in pcie. by setting the relaxed ordering bit in the packet header, the 82576 enables the system to optimize performance in the following cases:

• relaxed ordering for descriptor and data reads: when the 82576 emits a read transaction, its split completion has no ordering relationship with the writes from the cpus (same direction). it should be allowed to bypass the writes from the cpus.
• relaxed ordering for receiving data writes: when the 82576 masters receive data writes, it also enables them to bypass each other in the path to system memory because software does not process this data until their associated descriptor writes complete.
• the 82576 cannot relax ordering for descriptor writes, msi/msi-x writes or pcie messages.

relaxed ordering can be used in conjunction with the no-snoop attribute to enable the memory controller to advance non-snoop writes ahead of earlier snooped writes.

relaxed ordering is enabled in the 82576 by clearing the ro_dis bit in the ctrl_ext register. actual setting of relaxed ordering is done for lan traffic by the host through the dca registers.

3.1.4.5.4 snoop not required

the 82576 sets the snoop not required attribute bit for master data writes. system logic might provide a separate path into system memory for non-coherent traffic. the non-coherent path to system memory provides higher, more uniform bandwidth for write requests.

note: the snoop not required attribute does not alter transaction ordering. therefore, to achieve maximum benefit from snoop not required transactions, it is advisable to set the relaxed ordering attribute as well (assuming that system logic supports both attributes). in fact, some chipsets require that relaxed ordering is set for no-snoop to take effect.

global no-snoop support is enabled in the 82576 by clearing the ns_dis bit in the ctrl_ext register. actual setting of no snoop is done for lan traffic by the host through the dca registers.

3.1.4.5.5 no snoop and relaxed ordering for lan traffic

software might configure the no-snoop and relaxed ordering attributes for each queue and each type of transaction by setting the respective bits in the rxctrl and txctrl registers.

table 3-8 lists the default behavior for the no-snoop and relaxed ordering bits for lan traffic when i/oat 2 is enabled.

table 3-8. lan traffic attributes

transaction | no-snoop default | relaxed ordering default | comments
rx descriptor read | n | y |
rx descriptor write-back | n | n | relaxed ordering must never be used for this traffic.
rx data write | y | y | see the following note and section 3.1.4.5.5.1.
rx replicated header | n | y |
tx descriptor read | n | y |
tx descriptor write-back | n | y |
tx tso header read | n | y |
tx data read | n | y |

note: rx payload no-snoop is also conditioned by the nse bit in the receive descriptor. see section 3.1.4.5.5.1.
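the interaction between the table 3-8 defaults, the global disable bits and the nse descriptor bit can be sketched in python as follows. this is a hedged reading of the text: the function name and dictionary keys are hypothetical, the per-queue rxctrl/txctrl overrides are left out, and combining the conditions with a logical and is an assumption drawn from the phrase "also conditioned by".

```python
# (no_snoop_default, relaxed_ordering_default) per transaction, from table 3-8.
DEFAULT_ATTRS = {
    "rx descriptor read": (False, True),
    "rx descriptor write-back": (False, False),
    "rx data write": (True, True),
    "rx replicated header": (False, True),
    "tx descriptor read": (False, True),
    "tx descriptor write-back": (False, True),
    "tx tso header read": (False, True),
    "tx data read": (False, True),
}


def tlp_attributes(transaction, ns_dis=False, ro_dis=False, nse=False):
    """Return (no_snoop, relaxed_ordering) for a transaction type.

    ns_dis / ro_dis model the ctrl_ext global disable bits; nse models the
    nse bit of the receive descriptor and only matters for rx data writes.
    """
    no_snoop, relaxed = DEFAULT_ATTRS[transaction]
    no_snoop = no_snoop and not ns_dis
    relaxed = relaxed and not ro_dis
    if transaction == "rx data write":
        no_snoop = no_snoop and nse
    return no_snoop, relaxed
```

with these defaults, an rx data write uses no-snoop only when the descriptor's nse bit is set, and rx descriptor write-backs never use either attribute.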
3.1.4.5.5.1 no-snoop option for payload

under certain conditions, which occur when i/oat is enabled, software knows that it is safe to transfer (dma) a new packet into a certain buffer without snooping on the front-side bus. this scenario typically occurs when software is posting a receive buffer to hardware that the cpu has not accessed since the last time it was owned by hardware. this might happen if the data was transferred to an application buffer by the i/oat dma engine.

in this case, software should be able to set a bit in the receive descriptor indicating that the 82576 should perform a no-snoop dma transfer when it eventually writes a packet to this buffer.

when a non-snoop transaction is activated, the tlp header has a non-snoop attribute in the transaction descriptor field. this is triggered by the nse bit in the receive descriptor. see section 7.1.5.

3.1.4.5.5.2 no snoop option for tso header

because hardware reads the header of a tso request for each segment it sends, it is safe to assume that after the first read of the header it is updated in main memory. as a result, all subsequent reads of the header might be done with the no-snoop option set. this option is triggered by setting the nosnoop_lso_hdr_buf bit in the dtxctl register.

3.1.4.6 flow control

3.1.4.6.1 82576 flow control rules

the 82576 implements only the default virtual channel (vc0). a single set of credits is maintained for vc0.

table 3-9. allocation of fc credits

credit type | operations | number of credits
posted request header (ph) | target write (one unit); message (one unit) | two units (to enable concurrent accesses to both lan ports).
posted request data (pd) | target write (length/16 bytes=1); message (one unit) | max_payload_size/16
non-posted request header (nph) | target read (one unit); configuration read (one unit); configuration write (one unit) | two units (to enable concurrent target accesses to both lan ports).
non-posted request data (npd) | configuration write (one unit) | two units.
completion header (cplh) | read completion (n/a) | infinite (accepted immediately).
completion data (cpld) | read completion (n/a) | infinite (accepted immediately).

rules for fc updates:

• the 82576 maintains two credits for npd at any given time. it increments the credit by one after the credit is consumed and sends an updatefc packet as soon as possible. updatefc packets are scheduled immediately after a resource is available.
• the 82576 provides two credits for ph (such as for two concurrent target writes) and two credits for nph (such as for two concurrent target reads). updatefc packets are scheduled immediately after a resource becomes available.
• the 82576 follows the pcie recommendations for frequency of updatefc fcps.

3.1.4.6.2 upstream flow control tracking

the 82576 issues a master transaction only when the required fc credits are available. credits are tracked for posted, non-posted, and completions (the latter to operate against a switch).

3.1.4.6.3 flow control update frequency

in any case, updatefc packets are scheduled immediately after a resource becomes available.

when the link is in the l0 or l0s link state, update fcps for each enabled type of non-infinite fc credit must be scheduled for transmission at least once every 30 µs (-0%/+50%), except when the extended sync bit of the link control register is set, in which case the limit is 120 µs (-0%/+50%).

3.1.4.6.4 flow control timeout mechanism

the 82576 implements the optional fc update timeout mechanism. the mechanism is activated when the link is in the l0 or l0s link state. it uses a timer with a limit of 200 µs (-0%/+50%), where the timer is reset by the receipt of any init or update fcp. alternatively, the timer may be reset by the receipt of any dllp.

after timer expiration, the mechanism instructs the phy to re-establish the link (via the ltssm recovery state).

3.1.4.7 error forwarding

if a tlp is received with an error-forwarding trailer, the packet is dropped and not delivered to its destination. the 82576 does not initiate any additional master requests for that pci function until it detects an internal reset or a software reset for the associated lan. software is able to access device registers after such a fault.

system logic is expected to trigger a system-level interrupt to inform the operating system of the problem. the operating system can then stop the process associated with the transaction, re-allocate memory instead of the faulty area, etc.

3.1.5 data link layer

3.1.5.1 ack/nak scheme

the 82576 supports two alternative schemes for ack/nak rate:

1. ack/nak is scheduled for transmission according to timeouts specified in the ltiv register.
2. ack/nak is scheduled for transmission according to timeouts specified in the pcie specification.

the pcie error recovery bit loaded from eeprom determines which of the two schemes is used.
3.1.5.2 supported dllps

the following dllps are supported by the 82576 as a receiver:

table 3-10. dllps received by the 82576

dllp type | remarks
ack |
nak |
pm_request_ack |
initfc1-p | virtual channel 0 only
initfc1-np | virtual channel 0 only
initfc1-cpl | virtual channel 0 only
initfc2-p | virtual channel 0 only
initfc2-np | virtual channel 0 only
initfc2-cpl | virtual channel 0 only
updatefc-p | virtual channel 0 only
updatefc-np | virtual channel 0 only
updatefc-cpl | virtual channel 0 only

the following dllps are supported by the 82576 as a transmitter:

table 3-11. dllps initiated by the 82576

dllp type | remarks
ack |
nak |
pm_enter_l1 |
pm_enter_l23 |
pm_active_state_request_l1 |
initfc1-p | virtual channel 0 only
initfc1-np | virtual channel 0 only
initfc1-cpl | virtual channel 0 only
initfc2-p | virtual channel 0 only
initfc2-np | virtual channel 0 only
initfc2-cpl | virtual channel 0 only
updatefc-p | virtual channel 0 only
updatefc-np | virtual channel 0 only

note: updatefc-cpl is not sent because of the infinite fc-cpl allocation.
3.1.5.3 transmit edb nullifying

when a retrain is necessary, the tx packet must not be terminated abruptly. for this reason, early termination of the transmitted packet is possible. this is done by appending an edb (end bad symbol) to the packet.

3.1.6 physical layer

3.1.6.1 link width

the 82576 supports a maximum link width of x4, x2, or x1 as determined by the lane_width field in the pcie init configuration 3 eeprom word. the max link width is loaded into the maximum link width field of the pcie capability register (lcap[11:6]). the hardware default is a x4 link.

during link configuration, the platform and the 82576 negotiate a common link width. the link width must be one of the supported pcie link widths (x1, x2, x4), such that:

• if maximum link width = x4, then the 82576 negotiates to either x4, x2 or x1 (see the restriction in section 3.1.6.5).
• if maximum link width = x2, then the 82576 negotiates to either x2 or x1.
• if maximum link width = x1, then the 82576 only negotiates to x1.

3.1.6.2 polarity inversion

if polarity inversion is detected, the receiver must invert the received data.

during the training sequence, the receiver looks at symbols 6-15 of ts1 and ts2 as the indicator of lane polarity inversion (d+ and d- are swapped). if lane polarity inversion occurs, the ts1 symbols 6-15 received are d21.5 as opposed to the expected d10.2. similarly, if lane polarity inversion occurs, symbols 6-15 of the ts2 ordered set are d26.5 as opposed to the expected d5.2. this provides a clear indication of lane polarity inversion.

3.1.6.3 l0s exit latency

the number of fts sequences (n_fts) sent during l0s exit is loaded from the eeprom into an 8-bit read-only register.

3.1.6.4 lane-to-lane de-skew

a multi-lane link might have many sources of lane-to-lane skew. although symbols are transmitted simultaneously on all lanes, they cannot be expected to arrive at the receiver without lane-to-lane skew. the skew can include components that are less than a bit time, bit time units (400 ps for 2.5 gb), or full symbol time units (4 ns) of skew caused by the re-timing repeaters' insert/delete operations. receivers use ts1 or ts2 or skip ordered sets (sos) to perform link de-skew functions.

the 82576 supports de-skew of up to 6 symbol times (24 ns).
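the de-skew figures of section 3.1.6.4 follow from the 2.5 gt/s line rate and the 10-bit symbols of 8b/10b encoding. the short python calculation below reproduces them in integer picoseconds; the variable names are illustrative only.

```python
LINE_RATE_BITS_PER_S = 2_500_000_000   # 2.5 gt/s per lane
BITS_PER_SYMBOL = 10                   # 8b/10b encoding
MAX_DESKEW_SYMBOLS = 6                 # de-skew capability stated for the 82576

bit_time_ps = 10**12 // LINE_RATE_BITS_PER_S         # 400 ps
symbol_time_ps = bit_time_ps * BITS_PER_SYMBOL       # 4000 ps = 4 ns
max_deskew_ps = symbol_time_ps * MAX_DESKEW_SYMBOLS  # 24000 ps = 24 ns
```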
3.1.6.5 lane reversal

the following lane reversal modes are supported (see figure 3-2):

• lane configuration of x4, x2, and x1
• lane reversal in x4 and in x2
• degraded mode (downshift) from x4 to x2 to x1 and from x2 to x1, with one restriction - if lane reversal is executed in x4, then downshift is only to x1 and not to x2.

note: the restriction requires that a x2 interface to the 82576 must connect to lanes 0 and 1 on the 82576. the pcie card electromechanical specification does not allow routing a x2 link to a wider connector. therefore, a system designer is not allowed to connect a x2 link to lanes 2 and 3 of a pcie connector. it is also recommended that when used in x2 mode on a nic, the 82576 is connected to lanes 0 and 1 of the nic.

configuration bits:

• eeprom lane reversal disable bit - disables lane reversal altogether. see section 6.2.18, pcie control (word 0x1b) for the bit.

3.1.6.6 reset

the pcie phy can supply core reset to the 82576. the reset can be caused by the following sources:

1. upstream move to hot reset - inband mechanism (ltssm).
2. recovery failure (ltssm returns to detect).
3. upstream component moves to disable.

figure 3-2. lane reversal supported modes
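the link width rules of section 3.1.6.1 combined with the lane reversal restriction of section 3.1.6.5 can be summarized by the python sketch below. the function name and its parameters are hypothetical; the sketch only enumerates the widths the text allows and does not model the ltssm.

```python
def negotiable_widths(max_width, lane_reversed=False):
    """List the link widths the 82576 may end up with, widest first.

    max_width is the maximum link width loaded from eeprom (4, 2 or 1).
    lane_reversed indicates that lane reversal is executed on the link.
    """
    if max_width not in (4, 2, 1):
        raise ValueError("maximum link width must be 4, 2 or 1")
    widths = [w for w in (4, 2, 1) if w <= max_width]
    if lane_reversed and max_width == 4:
        # Restriction: with lane reversal in x4, downshift goes only to x1.
        widths.remove(2)
    return widths
```

for a x4 device with lane reversal executed, `negotiable_widths(4, lane_reversed=True)` yields x4 and x1 only.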
3.1.6.7 scrambler disable

the scrambler/de-scrambler functionality in the 82576 can be disabled by two mechanisms:

1. upstream, according to the pcie specification.
2. eeprom bit.

3.1.7 error events and error reporting

3.1.7.1 mechanism in general

pcie defines two error reporting paradigms: the baseline capability and the advanced error reporting (aer) capability. the baseline error reporting capabilities are required of all pcie devices and define the minimum error reporting requirements. the aer capability is defined for more robust error reporting and is implemented with a specific pcie capability structure. both mechanisms are supported by the 82576.

the serr# enable and the parity error bits from the legacy command register also take part in the error reporting and logging mechanism.

figure 3-3 shows, in detail, the flow of error reporting in the 82576.
3.1.7.2 error events

table 3-12 lists the error events identified by the 82576 and the response in terms of logging, reporting, and actions taken. consult the pcie specification for the effect on the pci status register.

figure 3-3. error reporting mechanism

table 3-12. response and reporting of error events

error name | error events | default severity | action

phy errors
receiver error | 8b/10b decode errors; packet framing error | correctable. send err_cor | tlp to initiate nak and drop data. dllp to drop.

data link errors
bad tlp | bad crc; not legal edb; wrong sequence number | correctable. send err_cor | tlp to initiate nak and drop data.
bad dllp | bad crc | correctable. send err_cor | dllp to drop.
replay timer timeout | replay_timer expiration | correctable. send err_cor | follow ll rules.
replay num rollover | replay num rollover | correctable. send err_cor | follow ll rules.
data link layer protocol error | received ack/nack not corresponding to any tlp | uncorrectable. send err_fatal | follow ll rules.

tlp errors
poisoned tlp received | tlp with error forwarding | uncorrectable. err_nonfatal. log header | a poisoned completion is ignored and the request can be retried after timeout. if enabled, the error is reported.
unsupported request (ur) | wrong configuration access; mrdlk; configuration request type 1; unsupported vendor defined type 0 message; not valid msg code; not supported tlp type; wrong function number; wrong tc/vc; received target access with data size > 64-bit; received tlp outside address range | uncorrectable. err_nonfatal. log header | send completion with ur.
completion timeout | completion timeout timer expired | uncorrectable. err_nonfatal | send the read request again.
completer abort | attempts to write to the flash device when writes are disabled (eec.fwe=01b) | uncorrectable. err_nonfatal. log header | send completion with ca.
unexpected completion | received completion without a request for it (tag, id, etc.) | uncorrectable. err_nonfatal. log header | discard tlp.
receiver overflow | received tlp beyond allocated credits | uncorrectable. err_fatal | receiver behavior is undefined.
flow control protocol error | minimum initial flow control advertisements; flow control update for infinite credit advertisement | uncorrectable. err_fatal | receiver behavior is undefined. the 82576 doesn't report violations of flow control initialization protocol.
malformed tlp (mp) | data payload exceeds max_payload_size; received tlp data size does not match length field; td field value does not correspond with the observed size; byte enables violations; power management messages that don't use tc0; usage of unsupported vc | uncorrectable. err_fatal. log header | drop the packet and free fc credits.
completion with unsuccessful completion status | - | no action (already done by originator of completion). | free fc credits.
byte count integrity in completion process | byte count isn't compatible with the length field and the actual expected completion length. for example, length field is 10 (in dword), actual length is 40, but the byte count field that indicates how many bytes are still expected is smaller than 40, which is not reasonable. | no action | the 82576 doesn't check for this error and accepts these packets. this may cause a completion timeout condition.

3.1.7.3 error pollution

error pollution can occur if error conditions for a given transaction are not isolated to the error's first occurrence. if the physical layer detects and reports a receiver error, the same packet is not signaled at the data link or transaction layers, to avoid having this error propagate and cause subsequent errors at upper layers. similarly, when the data link layer detects an error, subsequent errors that occur for the same packet are not signaled at the transaction layer.

3.1.7.4 completion with unsuccessful completion status

a completion with unsuccessful completion status is dropped and not delivered to its destination. the request that corresponds to the unsuccessful completion is retried by sending a new request for the data that was not delivered.

3.1.7.5 error reporting changes

the rev. 1.1 specification defines two changes to advanced error reporting. a new role-based error reporting bit in the device capabilities register is set to 1b to indicate that these changes are supported by the 82576.

1. setting the serr# enable bit in the pci command register also enables ur reporting (in the same manner that the serr# enable bit enables reporting of correctable and uncorrectable errors). in other words, the serr# enable bit overrides the ur error reporting enable bit in the pcie device control register.
2. changes in the response to some uncorrectable non-fatal errors, detected in non-posted requests to the 82576. these are called advisory non-fatal error cases. for each of the errors that follow, the following behavior is defined:
interconnects ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 99 a. the advisory non-fatal error status bit is set in the correctable error status register to indicate the occurrence of the advisory error and the advisory non-fatal error mask corresponding bit in the correctable error mask register is checked to determine whether to proceed further with logging and signaling. b. if the advisory non-fatal error mask bit is clear, logging proceeds by setting the corresponding bit in the uncorrectable error status register, based upon the specific uncorrectable error that's being reported as an advisory error. if the corresponding uncorrectable error bit in the uncorrectable error mask register is clear, the first error pointer and header log registers are updated to log the error, assuming they are not still occupied by a previously unserviced error. c. an err_cor message is sent if the correctable error reporting enable bit is set in the device control register. an error_nonfatal message is not sent for this error. the following uncorrectable non-fatal errors are considered as advisory non-fatal errors: ? a completion with an unsupported request or completer abort (ur/ca) status that signals an uncorrectable error for a non-posted request. if the severity of the ur/ca error is non-fatal, the completer must handle this case as an advisory non-fatal error. ? when the requester of a non-posted request times out while waiting for the associated completion, the requester is permitted to attempt to recover from the error by issuing a separate subsequent request, or to signal the error without attempting recovery. the requester is permitted to attempt recovery zero, one, or multiple (finite) times, but must signal the error (if enabled) with an uncorrectable error message if no further recovery attempt is made. 
if the severity of the completion timeout is non-fatal and the requester elects to attempt recovery by issuing a new request, the requester must first handle the current error case as an advisory non-fatal error. ? when a receiver receives an unexpected completion and the severity of the unexpected completion error is non-fatal, the receiver must handle this case as an advisory non-fatal error. 3.1.8 performance monitoring the 82576 incorporates pcie performance monitoring counters to provide common capabilities to evaluate performance. the 82576 implements four 32-bit counters to correlate between concurrent measurements of events as well as the sample delay and interval timers. the four 32-bit counters can also operate in a two 64-bit mode to count long intervals or payloads. software can reset, stop, or start the counters (all at the same time). the list of events supported by the 82576 and the counters control bits are described in the memory register map ( section 8.6 ). some counters operate with a threshold - the counter increments only when the monitored event crossed a configurable threshold (such as the number of available credits is below a threshold). counters operate in the following modes: ? count mode - the counter increments when the respective event occurred. ? leaky bucket mode - the counter increments only when the rate of events exceeded a certain value. see section 3.1.8.1 . 3.1.8.1 leaky bucket mode each of the counters may be configured independently to operate in a leaky bucket mode. when in leaky bucket mode, the following functionality is provided: ? one of four 16-bit leaky bucket counters (lbc) is enabled via the lbc enable [3:0] bits in the pcie statistic control register #1.
• the lbc is controlled by the gio_count_start, gio_count_stop, and gio_count_reset bits in the pcie statistic control register #1.
• the lbc increments every time the respective event occurs.
• the lbc is decremented every t ms as defined in the lbc timer field in the pcie statistic control registers.
• when an event occurs and the value of the lbc meets or exceeds the threshold defined in the lbc threshold field in the pcie statistic control registers, the respective statistics counter increments.

3.1.9 pcie power management

described in section 5.4.1 - power management.

3.1.10 pcie programming interface

described in section 9.0 - pcie programming interface.

3.2 management interfaces

see chapter 10.0, system manageability.

the 82576 contains two possible interfaces to an external bmc:

• smbus
• nc-si

since the manageability sideband throughput is lower than the network link throughput, the 82576 allocates an 8 kb internal buffer for incoming network packets prior to being sent over the sideband interface.

3.2.1 smbus

smbus is an optional interface for pass-through and/or configuration traffic between an external mc and the 82576. the smbus commands used to configure or read status from the 82576 are described in chapter 10.0, system manageability.

3.2.1.1 channel behavior

3.2.1.1.1 smbus addressing

the smbus addresses that the 82576 responds to depend on the lan mode (teaming/non-teaming). when the lan is in teaming mode (fail-over), the 82576 is presented over the smbus as one device with one smbus address. when the lan ports are in non-teaming mode, the 82576 is presented as two smbus devices with two smbus addresses. in dual-address mode all pass-through functionality is duplicated on each smbus address, where each smbus address is connected to a different lan port.
note: do not configure both ports to the same address. when a lan function is disabled, the corresponding smbus address is not presented to the external bmc.
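The leaky bucket mode of section 3.1.8.1 can be illustrated with a small behavioral model. This is a sketch only, not the hardware implementation: the class and method names are hypothetical, and two details that the text leaves open are assumed here, namely that the threshold comparison happens after the LBC is incremented and that the 16-bit LBC saturates rather than wraps.

```python
class LeakyBucketCounter:
    """Behavioral model of one statistics counter in leaky bucket mode."""

    LBC_MAX = 0xFFFF  # the LBC is described as a 16-bit counter

    def __init__(self, threshold):
        self.threshold = threshold  # models the LBC threshold field
        self.lbc = 0                # leaky bucket counter
        self.statistic = 0          # the statistics counter software reads

    def on_event(self):
        # The LBC increments every time the monitored event occurs.
        # Assumption: saturate at 16 bits and compare after the increment.
        self.lbc = min(self.lbc + 1, self.LBC_MAX)
        if self.lbc >= self.threshold:
            self.statistic += 1

    def on_timer_tick(self):
        # Called once per LBC timer period (every T ms).
        # Assumption: the LBC does not go below zero.
        if self.lbc > 0:
            self.lbc -= 1


counter = LeakyBucketCounter(threshold=3)
for _ in range(4):
    counter.on_event()       # a burst of four events
print(counter.statistic)     # only the events at or above the threshold count
```

With a threshold of 3, a slow trickle of events that the timer drains between occurrences never reaches the statistics counter, while a burst does, which is the rate-filtering effect the mode is meant to provide.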
the smbus address method is defined through the smbus addressing mode bit in the eeprom. the smbus addresses are set by smbus address 0 and smbus address 1 in the eeprom.

note: if the single-address mode is set, only the smbus address 0 field is valid.

the smbus addresses (those that are enabled from the eeprom) can be re-assigned using the smbus arp protocol. besides the smbus address values, all the previously stated parameters of the smbus (smbus channel selection, address mode, address enable) can be set only through eeprom configuration. the eeprom is read by the 82576 at power-up, resets, and other cases described in section 4.2.

all smbus addresses should be in network byte order (nbo); most significant byte first.

3.2.1.1.2 smbus notification methods

the 82576 supports three methods of informing the external mc that it has information that needs to be read by the external bmc:

• smbus alert.
• asynchronous notify.
• direct receive.

the notification method that is used by the 82576 can be configured from the smbus using the receive enable command. the default method is set from the eeprom in the pt init field.

the following events cause the 82576 to send a notification event to the external bmc:

• receiving a lan packet that was designated to the bmc.
• receiving a request status command from the mc initiates a status response (see section 10.5.10.2.2).
• status change has occurred and the 82576 is configured to notify the external mc upon one of the status changes. the following event triggers a notification to the bmc:
  - a change in any of the status data 1 bits of the read status command (see section 10.5.10.2.2 for description of this command).
• a circuit breaker indication - indicates matching of a circuit breaker filter (or of its counter/threshold).
there might be cases where the external mc is hung and is unable to respond to the smbus notification. the 82576 has a time-out value defined in the eeprom (see section 6.8 ) to avoid hanging while waiting for the notification response. if the mc does not respond until the timeout expires, the notification is de-asserted. 3.2.1.1.2.1 smbus alert and alert response method the smbus alert# signal is an additional smbus signal that acts as an asynchronous interrupt signal to an external smbus master. the 82576 asserts this signal each time it has a message that it needs the external mc to read and if the chosen notification method is the smbus-alert method. note that the smbus alert is an open-drain signal, which means that other devices besides the 82576 can be connected on the same alert pin and the external mc needs a mechanism to distinguish between the alert sources as described:
the external mc can respond to the alert by issuing an ara cycle (see figure 3-13) to detect the alert source device. the 82576 responds to the ara cycle (if it was the smbus alert source) and de-asserts the alert when the ara cycle completes. following the ara cycle, the external mc issues a read command to retrieve the 82576 message.

some bmcs do not implement ara cycle transactions. these bmcs respond to an alert by issuing a read command to the 82576 (0xc0/0xd0 or 0xde). the 82576 always responds to a read command, even if it is not the source of the notification. the default response is a status transaction. if the 82576 is the source of the smbus alert, it replies to the read transaction and de-asserts the alert after the command byte of the read transaction.

the ara cycle is an smbus receive byte transaction to smbus address 0001-100b. note that the ara transaction does not support pec. the ara transaction format is as follows:

table 3-13. smbus ara cycle format

bits:  1 | 7                      | 1  | 1 | 8                                 | 1 | 1
field: s | alert response address | rd | a | slave device address              | a | p
value:   | 0001 100               | 1  | 0 | manageability slave smbus address | 1 |

note: since the master-receiver (bmc receiver) is involved in the transaction, it must signal the end of data by generating a nack (a '1' in the ack bit position) on the slave device address byte that was clocked out. this releases the data line to allow the master to generate a stop condition.

3.2.1.1.2.2 asynchronous notify method

when configured to asynchronous notify method, the 82576 acts as smbus master and notifies the external mc by issuing a modified form of the write word transaction. the asynchronous notify transaction smbus address and data payload are configured using the receive enable command or using the eeprom defaults. note that the asynchronous notify method is not protected by a pec byte.

the target address and data byte low/high are taken from the receive enable command (see section 10.5.10.2.6) or eeprom configuration (see section 6.8).

table 3-14. asynchronous notify command format

bits:  1 | 7                 | 1  | 1 | 7                                 | 1 | 1 | ...
field: s | target address    | wr | a | sending device address            |   | a | ...
value:   | bmc slave address | 0  | 0 | manageability slave smbus address | 0 | 0 | ...

bits:  8             | 1 | 8              | 1 | 1
field: data byte low | a | data byte high | a | p
value: interface     | 0 | alert value    | 0 |
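As an illustration of the two formats in tables 3-13 and 3-14, the following sketch builds the address byte of an ARA cycle and the byte sequence of an asynchronous notify transaction. The function names are hypothetical, the addresses are assumed to be passed as 7-bit values, and start, stop, and acknowledge bits are left to the bus controller rather than modeled.

```python
ALERT_RESPONSE_ADDRESS = 0b0001100  # 7-bit ARA address, 0001-100b in the text


def _check_7bit(name, value):
    if not 0 <= value <= 0x7F:
        raise ValueError(f"{name} must be a 7-bit SMBus address")


def ara_address_byte():
    """First byte of the ARA cycle: 7-bit alert response address plus Rd = 1."""
    return (ALERT_RESPONSE_ADDRESS << 1) | 1


def async_notify_bytes(target_address, sending_address, interface, alert_value):
    """Bytes of the asynchronous notify transaction, in wire order.

    target_address  -- 7-bit address of the BMC (Wr bit = 0 is appended)
    sending_address -- 7-bit manageability slave address (a 0 bit is appended)
    interface       -- data byte low
    alert_value     -- data byte high
    """
    _check_7bit("target_address", target_address)
    _check_7bit("sending_address", sending_address)
    return bytes([
        (target_address << 1) | 0,
        (sending_address << 1) | 0,
        interface & 0xFF,
        alert_value & 0xFF,
    ])


print(hex(ara_address_byte()))
print(async_notify_bytes(0x10, 0x49, 0x01, 0x02).hex())
```

The example addresses 0x10 and 0x49 are placeholders chosen for the demonstration; the real values come from the receive enable command or the EEPROM defaults.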
3.2.1.1.2.3 direct receive method

if configured, the 82576 has the capability to send the message it needs to transfer to the external mc as a master over the smbus, instead of alerting the bmc and waiting for it to read the message. table 3-15 shows the message. note that the 'f', 'l' and command fields in the message are the same as the op-code returned by the 82576 in response to a mc receive tco packet block read command (see section 10.5.10.2.1). the rules for the 'f' and 'l' flags are also the same as used in the receive tco packet block read command.

table 3-15. direct receive transaction format

bits:  1 | 7                 | 1  | 1 | 1          | 1         | 6                           | 1 | ...
field: s | target address    | wr | a | f          | l         | command                     | a | ...
value:   | bmc slave address | 0  | 0 | first flag | last flag | receive tco command 01 0000b | 0 | ...

bits:  8          | 1 | 8           | 1 | ... | 1 | 8           | 1 | 1
field: byte count | a | data byte 1 | a | ... | a | data byte n | a | p
value: n          | 0 |             | 0 | ... | 0 |             | 0 |

3.2.1.1.3 receive tco flow

the 82576 is used as a channel for receiving packets from the network link and passing them to the external bmc. the mc can configure the 82576 to pass specific packets to the mc as described in section 10.5.10.1.5. once a full packet is received from the link and identified as a manageability packet that should be transferred to the bmc, the 82576 starts the receive tco transaction flow to the bmc.

the maximum smbus fragment length is defined in the eeprom (see section 6.8.2). the 82576 uses the smbus notification method to notify the mc that it has data to deliver. the packet is divided into fragments, where the 82576 uses the maximum fragment size allowed in each fragment. the last fragment of the packet transfer is always the status of the packet. as a result, the packet is transferred in at least two fragments. the data of the packet is transferred in the receive tco lan packet transaction as described in section 10.5.10.2.1.

when smbus alert is selected as the mc notification method, the 82576 notifies the mc on each fragment of a multi-fragment packet. when asynchronous notify is selected as the mc notification method, the 82576 notifies the mc only on the first fragment of a received packet. it is the bmc's responsibility to read the full packet including all the fragments.

any timeout on the smbus notification results in discarding the entire packet. any nack by the mc on one of the 82576 receive bytes also causes the packet to be silently discarded.

the maximum size of the received packet is limited by the 82576 hardware to 1536 bytes. packets larger than 1536 bytes are silently discarded. any packet smaller than 1536 bytes is processed by the 82576.

note: when the rcv_en bit is cleared, all receive tco functionality is disabled, not just the packets that are directed to the mc (also auto arp packets).
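The fragmentation rule of the receive TCO flow (section 3.2.1.1.3) and the F/L/command byte layout of table 3-15 can be sketched as follows. This is an illustrative model only: the function names are hypothetical, the status payload is an opaque placeholder (its content is defined in section 10.5.10.2.1, not here), and the maximum fragment length is assumed to count payload bytes.

```python
RECEIVE_TCO_COMMAND = 0b010000  # 6-bit command field from table 3-15
MAX_RECEIVE_PACKET = 1536       # larger packets are silently discarded


def command_byte(first, last, command=RECEIVE_TCO_COMMAND):
    """Pack the F flag (bit 7), L flag (bit 6) and the 6-bit command."""
    return (int(first) << 7) | (int(last) << 6) | (command & 0x3F)


def fragment_receive_packet(packet, max_fragment, status):
    """Split a received packet into (command byte, payload) fragments.

    Data fragments use the maximum fragment size; the final fragment always
    carries the packet status, so there are at least two fragments.
    """
    if max_fragment <= 0:
        raise ValueError("max_fragment must be positive")
    if len(packet) > MAX_RECEIVE_PACKET:
        return []  # models the silent discard of oversized packets
    payloads = [packet[i:i + max_fragment]
                for i in range(0, len(packet), max_fragment)]
    payloads.append(status)
    last_index = len(payloads) - 1
    return [(command_byte(i == 0, i == last_index), payload)
            for i, payload in enumerate(payloads)]


fragments = fragment_receive_packet(bytes(100), 32, b"\x00\x00")
print([hex(cmd) for cmd, _ in fragments])
```

For a 100-byte packet and a 32-byte fragment limit, the model yields four data fragments followed by the status fragment, with F set only on the first and L set only on the last.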
3.2.1.1.4 transmit tco flow

the 82576 is used as a channel for transmitting packets from the external mc to the network link. the network packet is transferred from the external mc over the smbus, and then, when fully received by the 82576, is transmitted over the network link.

in dual-address mode, each smbus address is connected to a different lan port. a packet received in smbus transactions using smbus address 0 is transmitted to the network using lan port 0; a packet received on smbus address 1 is transmitted through lan port 1. in single-address mode, the transmitted port is selected according to the fail-over algorithm (see section 3.2.1.1.9).

the 82576 supports packets up to the ethernet packet length (1536 bytes). smbus transactions can be up to 240 bytes in length, which means that packets can be transferred over the smbus in more than one fragment. each command byte contains the f and l bits. when the f bit is set, this is the first fragment of the packet; when the l bit is set, this is the last fragment of the packet.

note: when both flags are set, the entire packet is in one fragment.

the packet is sent over the network link only after all its fragments are received correctly over the smbus. the 82576 calculates the l2 crc on the transmitted packet and adds its four bytes at the end of the packet. any other packet field (such as xsum) must be calculated and inserted by the external mc (the 82576 does not change any field in the transmitted packet, besides adding padding and crc bytes).

note: if the packet sent by the mc is larger than 1536 bytes, the packet is silently discarded by the 82576.

the minimum packet length defined by the 802.3 specification is 64 bytes. the 82576 pads packets that are less than 64 bytes to meet the specification requirements.
there is one exception: when the packet sent over the smbus is less than 32 bytes, the external mc must pad it to at least 32 bytes. the padding bytes value should be zero.

note: packets that are smaller than 32 bytes (including padding) are silently discarded by the 82576.

if the network link goes down at any time while the 82576 is receiving the packet, it silently discards the packet. note that any link down event during the transfer of a packet over the smbus (after it was received from the network) does not stop the operation.

the transmit smbus transactions are described in section 10.5.5.2.

3.2.1.1.5 transmit errors in sequence handling

once a packet is transferred over the smbus from the mc to the 82576, the f and l flags should follow specific rules. the f flag defines that this is the first fragment of the packet; the l flag defines that the transaction contains the last fragment of the packet. the following table lists the different options regarding the flags in transmit packet transactions:

table 3-16. flags in transmit packet transactions

previous | current   | action/notes
last     | first     | accepts both.
last     | not first | error for current transaction. current transaction is discarded and an abort status is asserted.
not last | first     | error for previous transaction. previous transaction (until previous first) is discarded. current packet is processed. no abort status is asserted.
not last | not first | processes the current transaction.

note that since every other block write command in the tco protocol has both f and l flags off, such commands flush any pending transmit fragments that were previously received. in other words, when running the tco transmit flow, no other block write transactions are allowed in between the fragments.

3.2.1.1.6 tco command aborted flow

bit 6 in the first byte of the status returned from the 82576 to the external mc indicates that there was a problem with previous smbus transactions or with the completion of the operation requested in a previous transaction. an abort can be asserted for any of the following reasons:

• any error in the smbus protocol (nack, smbus timeouts).
• any error in compatibility between required protocols to specific functionality (receive enable command with byte count not 1/14 as defined in the command specification).
• if the 82576 does not have space to store the transmit packet from the mc (in its internal buffer before sending it to the link). in this case, the entire transaction completes, but the packet is discarded and the mc is notified about it through the abort bit.
• error in the f / l bit sequence during multi-fragment transactions.
• the abort bit is asserted after an internal reset to the 82576 manageability unit.

note: an abort in the status does not always imply that the last transaction of the sequence was incorrect. there is a gap between the time the status is read from the 82576 and the time the transaction occurred.

3.2.1.1.7 concurrent smbus transactions

concurrent smbus write transactions are not permitted. once a transaction is started, it must be completed before an additional transaction can be initiated.

3.2.1.1.8 smbus arp functionality

the 82576 supports the smbus arp protocol as defined in the smbus 2.0 specification. the 82576 is a persistent slave address device, meaning that its smbus address is valid after power-up and loaded from the eeprom. the 82576 supports all smbus arp commands defined in the smbus specification, both general and directed.

note: smbus arp can be disabled through eeprom configuration (see section 6.8.3).

smbus-arp transactions are described in section 10.5.5.2.
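The F/L sequence rules of table 3-16 (section 3.2.1.1.5), together with the transmit length limits of section 3.2.1.1.4, can be modeled with a small state machine. This is a sketch under stated assumptions, not the device logic: the class name is hypothetical, the idle state is assumed to behave like "previous = last", a discarded transaction is assumed to leave the state unchanged, and padding and CRC insertion are not modeled.

```python
MAX_PACKET = 1536        # larger packets are silently discarded
MIN_SMBUS_PACKET = 32    # shorter packets are silently discarded


class TransmitReassembler:
    """Reassembles transmit fragments according to the F and L flags."""

    def __init__(self):
        self.pending = bytearray()
        self.previous_was_last = True  # assumption: idle acts like "last"
        self.abort = False             # models the abort status bit
        self.sent = []                 # packets handed to the network link

    def on_fragment(self, first, last, data):
        if self.previous_was_last and not first:
            # Row "last / not first": discard the current transaction
            # and assert the abort status.
            self.abort = True
            return
        if not self.previous_was_last and first:
            # Row "not last / first": drop the unfinished previous packet,
            # process the current one, no abort status.
            self.pending.clear()
        self.pending += data
        self.previous_was_last = last
        if last:
            packet = bytes(self.pending)
            self.pending.clear()
            if MIN_SMBUS_PACKET <= len(packet) <= MAX_PACKET:
                self.sent.append(packet)


r = TransmitReassembler()
r.on_fragment(True, False, b"a" * 40)
r.on_fragment(False, True, b"b" * 40)
print(len(r.sent), r.abort)
```

A packet is only appended to `sent` once its last fragment arrives, mirroring the rule that the packet goes to the network link only after all fragments are received correctly.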
3.2.1.1.8.1 smbus arp in dual-/single-address mode

the 82576 operates either in single smbus address mode or in dual smbus address mode. these modes are reflected in its smbus-arp behavior.

when operating in single-address mode, the 82576 presents itself on the smbus as one device and responds to smbus-arp as one device only. in this case, its smbus address is smbus address 0 as defined in the eeprom smbus arp addresses word (see section 6.7.32 and section 6.7.33). the 82576 has only one ar flag and one av flag. the vendor specific id, which is the mac address of the lan port, is taken from the port 0 address.

in dual-address mode, the 82576 responds as two smbus devices, meaning that it has two sets of ar / av flags (one for each port). the 82576 responds twice to the smbus-arp master, one time for each port. both smbus addresses are taken from the smbus arp addresses word of the eeprom. the udids of the two ports differ in the vendor specific id field, which carries the mac address and is different for each port. it is recommended for the 82576 to first answer as port 0, and only when the address is assigned, to start answering as port 1 to the get udid command.

3.2.1.1.8.2 smbus arp flow

smbus-arp flow is based on the status of two flags:

• av - address valid - this flag is set when the 82576 has a valid smbus address.
• ar - address resolved - this flag is set when the 82576's smbus address is resolved (smbus address was assigned by the smbus-arp process).

note: these flags are internal to the 82576 and are not visible to external smbus devices.

since the 82576 is a persistent smbus address (psa) device, the av flag is always set, while the ar flag is cleared after power-up until the smbus-arp process completes. since the av flag is always set, the 82576 always has a valid smbus address.
when the smbus master needs to start an smbus-arp process, it resets (in terms of arp functionality) all the devices on the smbus by issuing either prepare to arp or reset device commands. when the 82576 accepts one of these commands, it clears its ar flag (if set from a previous smbus-arp process), but not its av flag (the current smbus address remains valid until the end of the smbus arp process). a cleared ar flag means that the 82576 answers the subsequent smbus arp transactions issued by the master.

the smbus master then issues a get udid command (general or directed) to identify the devices on the smbus. the 82576 always responds to the directed command, and responds to the general command only if its ar flag is not set. after the get udid command, the master assigns the 82576's smbus address by issuing an assign address command. the 82576 checks whether the udid matches its own udid, and if they match, it switches its smbus address to the address assigned by the command (byte 17). after accepting the assign address command, the ar flag is set and from this point on (as long as the ar flag is set), the 82576 does not respond to the get udid general command, while all other commands should be processed even if the ar flag is set. the 82576 stores the smbus address that was assigned in the smbus-arp process in its eeprom, so after the next power-up, it returns to its assigned smbus address.

figure 3-4 shows the smbus-arp behavior of the 82576.
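The AV/AR flag handling described in this section can be summarized in a short behavioral model. It is a sketch only: the class and method names are hypothetical, the UDID is treated as an opaque byte string, and storing the assigned address in the EEPROM is not modeled.

```python
class ArpSlave:
    """Models the SMBus ARP flags of a persistent slave address device."""

    def __init__(self, udid, address):
        self.udid = udid
        self.address = address
        self.av = True    # address valid: always set for a PSA device
        self.ar = False   # address resolved: cleared after power-up

    def prepare_to_arp(self):
        # Clears AR but keeps AV; the current address stays valid.
        self.ar = False

    def reset_device(self):
        self.ar = False

    def get_udid(self, directed):
        # Directed requests are always answered; general requests are
        # answered only while the AR flag is clear.
        if directed or not self.ar:
            return self.udid
        return None

    def assign_address(self, udid, new_address):
        # The address is switched only when the UDID matches.
        if udid != self.udid:
            return False
        self.address = new_address
        self.ar = True
        return True


dev = ArpSlave(udid=bytes(16), address=0x49)
dev.assign_address(bytes(16), 0x4A)
print(dev.get_udid(directed=False), hex(dev.address))
```

After a successful assign address command the device stops answering the general get UDID command until the master issues prepare to ARP or reset device again.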
figure 3-4. smbus arp flow

3.2.1.1.8.3 smbus arp udid content

the unique device identifier (udid) provides a mechanism to isolate each device for the purpose of address assignment. each device has a unique identifier. the 128-bit number is comprised of the following fields:

table 3-17. unique device identifier (udid)

fields are listed from msb to lsb.

field                | size    | value
device capabilities  | 1 byte  | see below
version / revision   | 1 byte  | see below
vendor id            | 2 bytes | 0x8086
device id            | 2 bytes | 0x10c9
interface            | 2 bytes | 0x0004
sub-system vendor id | 2 bytes | 0x0000
sub-system device id | 2 bytes | 0x0000
vendor specific id   | 4 bytes | see below

where:

• vendor id - the device manufacturer's id as assigned by the sbs implementers' forum or the pci sig - constant value: 0x8086.
• device id - the device id as assigned by the device manufacturer (identified by the vendor id field) - constant value: 0x10c9.
• interface - identifies the protocol layer interfaces supported over the smbus connection by the device - in this case, smbus version 2.0 - constant value: 0x0004.
• sub-system fields - these fields are not supported and return zeros.

device capabilities: dynamic and persistent address, pec support bit:

table 3-18. dynamic and persistent address, pec support bit

bit:   7:6          | 5            | 4            | 3            | 2            | 1            | 0
field: address type | reserved (0) | reserved (0) | reserved (0) | reserved (0) | reserved (0) | pec supported
value: 01b          | 0b           | 0b           | 0b           | 0b           | 0b           | 0b

version/revision: udid version 1, silicon revision:

table 3-19. version/revision: udid version 1, silicon revision

bit:   7            | 6            | 5:3          | 2:0
field: reserved (0) | reserved (0) | udid version | silicon revision id
value: 0b           | 0b           | 001b         | see below

silicon revision id:

table 3-20. silicon revision id

silicon version | revision id
a1              | 001b
a1/b0           | 001b
c0              | 010b
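To show how the fields of tables 3-17 through 3-19 combine into the 128-bit UDID, here is a minimal packing sketch. The function name is hypothetical; the silicon revision id and the 4-byte vendor specific id are supplied by the caller, and the fields are assumed to be serialized most significant byte first, following the msb/lsb ordering of table 3-17.

```python
import struct

VENDOR_ID = 0x8086
DEVICE_ID = 0x10C9
INTERFACE = 0x0004            # SMBus version 2.0
SUBSYSTEM_VENDOR_ID = 0x0000  # not supported, returns zeros
SUBSYSTEM_DEVICE_ID = 0x0000  # not supported, returns zeros


def build_udid(silicon_revision_id, vendor_specific_id):
    """Pack the 16-byte UDID from its fields, MSB first."""
    if not 0 <= silicon_revision_id <= 0b111:
        raise ValueError("silicon revision id is a 3-bit field")
    if len(vendor_specific_id) != 4:
        raise ValueError("vendor specific id must be 4 bytes")
    # Table 3-18: address type 01b in bits 7:6, PEC supported = 0 in bit 0.
    device_capabilities = 0b01 << 6
    # Table 3-19: UDID version 001b in bits 5:3, silicon revision in 2:0.
    version_revision = (0b001 << 3) | silicon_revision_id
    return struct.pack(
        ">BBHHHHH4s",
        device_capabilities,
        version_revision,
        VENDOR_ID,
        DEVICE_ID,
        INTERFACE,
        SUBSYSTEM_VENDOR_ID,
        SUBSYSTEM_DEVICE_ID,
        vendor_specific_id,
    )


udid = build_udid(0b001, bytes([0x11, 0x22, 0x33, 0x44]))
print(udid.hex())
```

The vendor specific id bytes in the example are placeholders; on the device they are derived from the port's MAC address.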
vendor specific id - four lsb bytes of the 82576's ethernet mac address. the 82576's ethernet address is taken from words 0b-2b in the eeprom. note that in the 82576 there are two mac addresses (one for each port). bit 0 of the port 1 mac address has the inverted value of bit 0 from the eeprom.

table 3-21. vendor specific id

fields are listed from msb to lsb.

1 byte              | 1 byte              | 1 byte              | 1 byte
mac address, byte 3 | mac address, byte 2 | mac address, byte 1 | mac address, byte 0

3.2.1.1.9 lan fail-over through smbus

in fail-over mode, the 82576 determines which ports are used for transmit and receive (according to the configuration). lan fail-over is tied to the smbus addressing mode. when the smbus is in dual-address mode, the 82576 does not activate its fail-over mechanism (ignores the fail-over register) and operates using individual lan ports. when the smbus is in single-address mode or in pass-through mode, the 82576 operates in fail-over mode. see section 10.5.11.

3.2.2 nc-si

the nc-si interface in the 82576 is a connection to an external mc defined by the dmtf nc-si protocol. it operates as a single interface with an external bmc, where all traffic between the 82576 and the mc flows through the interface.

3.2.2.1 electrical characteristics

the 82576 complies with the electrical characteristics defined in the nc-si specification. however, the 82576 pads are not 5v tolerant and require that signals conform to 3.3v signaling.

the 82576 nc-si behavior is configured by the 82576 on power-up:

• the 82576 provides an nc-si clock output if enabled by the nc-si clock direction eeprom bit. the default value is to use an external clock source as defined in the nc-si specification.
• the output driver strength for the nc-si_clk_out pad is configured by the eeprom nc-si clock pad drive strength bit (default = 0b).
• the output driver strength for the nc-si output signals (nc-si_dv & nc-si_rx) is configured by the eeprom nc-si data pad drive strength bit (default = 0b).
• the multi-drop nc-si eeprom bit defines the nc-si topology (point-to-point or multi-drop; the default is point-to-point).

the 82576 can provide an nc-si clock output as previously mentioned. the nc-si clock input (nc-si_clk_in) serves as an nc-si input clock in either case. that is, if the 82576 provides an nc-si output clock, the platform is required to route it back through the nc-si clock input with the correct latency. see the electrical chapter for more details.

the 82576 dynamically drives its nc-si output signals (nc-si_dv and nc-si_rx) as required by the sideband protocol:

• on power-up, the 82576 floats the nc-si outputs.
• if the 82576 operates in point-to-point mode, then the 82576 starts driving the nc-si outputs at some time following power-up.
• if the 82576 operates in a multi-drop mode, the 82576 drives the nc-si outputs as configured by the bmc.

3.2.2.2 nc-si transactions

the nc-si link supports both pass-through traffic between the mc and the 82576 lan functions, as well as configuration traffic between the mc and the 82576 internal units as defined in the nc-si protocol. see

3.3 flash / eeprom

3.3.1 eeprom interface

3.3.1.1 general overview

the 82576 uses an eeprom device for storing product configuration information. the eeprom is divided into three general regions:

• hardware accessed - loaded by the 82576 after power-up, pci reset de-assertion, d3 -> d0 transition, or a software-commanded eeprom read (ctrl_ext.ee_rst).
• manageability firmware accessed - loaded by the 82576 in pass-through mode after power-up or firmware reset.
• software accessed - used only by software. the meaning of these registers, as listed here, is a convention for software only and is ignored by the 82576.

table 3-22 lists the structure of the eeprom image in the 82576. the eeprom mapping is described in section 6.0.

table 3-22. eeprom structure

address     | content
0x0 - 0x9   | mac address and software area
0xa - 0x2f  | hardware area (+ pointer to analog configuration)
0x30 - 0x3f | pxe area
0x40 - 0x4f | reserved
0x50 - 0x5a | fw pointers
-           | firmware structures
-           | vpd area
-           | analog configuration (pcie/phy/pll/serdes structures)
3.3.1.2 eeprom device

the eeprom interface supports an spi interface and expects the eeprom to be capable of 2 mhz operation.

the 82576 is compatible with many sizes of 4-wire serial eeprom devices. different eeprom sizes have differing numbers of address bits (8 bits or 16 bits). software must be aware of this when doing direct access. see section 11.5.2, eeprom device options.

3.3.1.3 software accesses

the 82576 provides two different methods for software access to the eeprom. software can either use the built-in controller to read the eeprom or access the eeprom directly using the eeprom's 4-wire interface. in addition, the vpd area of the eeprom can be accessed via the vpd capability structure of the pcie.

software can use the eeprom read (eerd) register to cause the 82576 to read a word from the eeprom that the software can then use. to do this, software writes the address to read to the read address (eerd.addr) field and simultaneously writes a 1b to the start read bit (eerd.start). the 82576 reads the word from the eeprom, sets the read done bit (eerd.done), and puts the data in the read data field (eerd.data). software can poll the eeprom read register until it sees the read done bit set and then use the data from the read data field. any words read this way are not written to the 82576's internal registers.

software can also directly access the eeprom's 4-wire interface through the eeprom/flash control (eec) register. it can use this for reads, writes, or other eeprom operations.

to directly access the eeprom, software should follow these steps:

1. write a 1b to the eeprom request bit (eec.ee_req).
2. read the eeprom grant bit (eec.ee_gnt) until it becomes 1b. it remains 0b as long as the hardware is accessing the eeprom.
3. write or read the eeprom using the direct access to the 4-wire interface as defined in the eeprom/flash control and data (eec) register. the exact protocol used depends on the eeprom placed on the board and can be found in the appropriate datasheet.
4. write a 0b to the eeprom request bit (eec.ee_req).

finally, software can cause the 82576 to re-read the hardware accessed fields of the eeprom (setting the 82576's internal registers appropriately) by writing a 1b to the eeprom reset bit of the extended device control register (ctrl_ext.ee_rst).

note: if the eeprom does not contain a valid signature (see section 3.3.1.4), the 82576 assumes 16-bit addressing. in order to access an eeprom that requires 8-bit addressing, software must use the direct access mode.
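The two software access methods of section 3.3.1.3 can be sketched against mock registers. Everything below is illustrative: the bit positions of EERD.START, EERD.DONE, EERD.ADDR, EERD.DATA, EEC.EE_REQ and EEC.EE_GNT are assumptions made for the example (the authoritative layout is in the register chapter), and the `MockEerd` and `MockEec` classes are hypothetical stand-ins for memory-mapped register access.

```python
# Assumed bit layout, for illustration only.
EERD_START = 1 << 0
EERD_DONE = 1 << 1
EERD_ADDR_SHIFT = 2
EERD_DATA_SHIFT = 16
EEC_EE_REQ = 1 << 6
EEC_EE_GNT = 1 << 7


class MockEerd:
    """Stand-in for the EERD register, backed by a list of 16-bit words."""

    def __init__(self, words):
        self.words = words
        self.value = 0

    def write(self, value):
        self.value = value
        if value & EERD_START:
            addr = (value >> EERD_ADDR_SHIFT) & 0x3FFF
            self.value = ((self.words[addr] << EERD_DATA_SHIFT)
                          | (addr << EERD_ADDR_SHIFT) | EERD_DONE)

    def read(self):
        return self.value


class MockEec:
    """Stand-in for the EEC register that grants access immediately."""

    def __init__(self):
        self.value = 0

    def write(self, value):
        granted = bool(value & EEC_EE_REQ)
        self.value = (value | EEC_EE_GNT) if granted else (value & ~EEC_EE_GNT)

    def read(self):
        return self.value


def eeprom_read_word(eerd, addr, max_polls=1000):
    """Write address and START together, then poll DONE and return DATA."""
    eerd.write((addr << EERD_ADDR_SHIFT) | EERD_START)
    for _ in range(max_polls):
        value = eerd.read()
        if value & EERD_DONE:
            return (value >> EERD_DATA_SHIFT) & 0xFFFF
    raise TimeoutError("EERD.DONE was never set")


def with_direct_access(eec, operation, max_polls=1000):
    """Run `operation` between the request/grant steps 1-4 of the text."""
    eec.write(eec.read() | EEC_EE_REQ)            # step 1: request
    try:
        for _ in range(max_polls):                # step 2: wait for grant
            if eec.read() & EEC_EE_GNT:
                break
        else:
            raise TimeoutError("EEC.EE_GNT was never set")
        return operation()                        # step 3: bit-bang access
    finally:
        eec.write(eec.read() & ~EEC_EE_REQ)       # step 4: release


print(hex(eeprom_read_word(MockEerd([0x1234, 0xABCD, 0x4000]), 2)))
```

Bounding the poll loops is a design choice of this sketch so that a missing EEPROM or a stuck grant surfaces as an error instead of hanging the caller; the `finally` clause ensures the request bit is released even if the direct access operation fails.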
3.3.1.4 signature field

the 82576 determines if an eeprom is present by attempting to read it. the 82576 first reads the eeprom sizing and protected fields word at address 0x12. it checks the signature value in bits 15 and 14. if bit 15 is 0b and bit 14 is 1b, it considers the eeprom to be present and valid, reads additional eeprom words, and then programs its internal registers based on the values read. otherwise, it ignores the values it reads from that location and does not read any other words as part of the auto-read process. however, the eeprom is still accessible to software.

3.3.1.5 protected eeprom space

the 82576 provides a mechanism for hiding an area of the eeprom from the host. the hidden area cannot be accessed via the eeprom registers in the csr space. it can be accessed only by the manageability subsystem. this area is located at the end of the eeprom memory. its size is defined by the hepsize field in eeprom word 0x12. note that the current 82576 firmware does not use this mechanism.

a mechanism to protect part of the eeprom from host writes is also provided. this mechanism is controlled by words 0x2d and 0x2c, which control the start and the end of the read-only area.

3.3.1.5.1 initial eeprom programming

in most applications, initial eeprom programming is done directly on the eeprom pins. nevertheless, it is desirable to enable existing software utilities (accessing the eeprom via the host interface) to initially program the entire eeprom without breaking the protection mechanism. following a power-up sequence, the 82576 reads the hardware initialization words in the eeprom. if the signature in word 0x12 does not equal 01b, the eeprom is assumed to be non-programmed. there are two effects of a non-valid signature:

• the 82576 does not read any further eeprom data and sets the relevant registers to default.
• the 82576 enables access to any location in the eeprom via the eeprom csr registers.

3.3.1.5.2 activating the protection mechanism

following initialization, the 82576 reads the eeprom and turns on the protection mechanism if word 0x12 contains a valid signature (equals 01b) and word 0x12, bit 4 is set (enable protection). once the protection mechanism is turned on, words 0x12, 0x2c and 0x2d become write-protected, the area that is defined by word 0x12 becomes hidden (that is, read/write protected) and the area defined by words 0x2c and 0x2d becomes write-protected.

• no matter what is designated as the read-only protected area, words 0x30:0x3f (used by the pxe driver) are writeable, unless they are defined as hidden.

3.3.1.5.3 non-permitted access to protected areas in the eeprom

this paragraph refers to eeprom accesses via the eec (bit banging) or eerd (parallel read access) registers. following a write access to the protected areas in the eeprom, hardware responds properly on the pcie interface but does not initiate any access to the eeprom. following a read access to the hidden area in the eeprom (as defined by word 0x12), hardware does not access the eeprom and returns meaningless data to the host.

note: using bit banging, the spi eeprom can be accessed in a burst mode, for example, providing op-code, address, and then read or write data for multiple bytes. hardware inhibits any attempt to access the protected eeprom locations even in burst accesses.
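the signature and protection checks on word 0x12 can be expressed as a few bit tests. the following is a minimal sketch based only on the bit positions given in the text (signature in bits 15:14, enable protection in bit 4); the function names are hypothetical and the hepsize field is not decoded because its position is not given here.

```python
"""Sketch of the word 0x12 checks described in sections 3.3.1.4 and 3.3.1.5.2."""

SIGNATURE_WORD = 0x12           # EEPROM sizing and protected fields word
SIGNATURE_VALID = 0b01          # bit 15 = 0b, bit 14 = 1b
ENABLE_PROTECTION_BIT = 4


def signature(word_0x12):
    """Return the two signature bits (bits 15:14) of word 0x12."""
    return (word_0x12 >> 14) & 0b11


def eeprom_is_valid(word_0x12):
    """True when the auto-read process would treat the EEPROM as present and valid."""
    return signature(word_0x12) == SIGNATURE_VALID


def protection_active(word_0x12):
    """True when the signature is valid and the enable protection bit is set."""
    return eeprom_is_valid(word_0x12) and bool(word_0x12 & (1 << ENABLE_PROTECTION_BIT))
```

a blank device typically reads back 0xffff, which gives a signature of 11b, so such a part is treated as non-programmed and stays fully accessible through the csr registers.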
software should not access the eeprom in a burst-write mode that starts in a non-protected area and continues into a protected one. in such a case, it is not guaranteed that the write access to any area ever takes place.

3.3.1.6 eeprom recovery

the eeprom contains fields that, if programmed incorrectly, might affect the functionality of the 82576. the impact can range from an incorrect setting of some function (such as led programming), via disabling of entire features (such as no manageability) and link disconnection, to the inability to access the 82576 via the regular pcie interface.

the 82576 implements a mechanism that enables recovery from a faulty eeprom no matter what the impact is, using an smbus message that instructs firmware to invalidate the eeprom. firmware is able to receive this smbus message in all modes, no matter what the content of the eeprom is (even in diagnostic mode). after receiving this kind of message, firmware clears the signature of the eeprom in word 0x12 (bits 15/14 to 00b). afterwards, the bios/operating system initiates a reset to force an eeprom auto-load process that fails in order to enable access to the 82576.

firmware is programmed to receive such a command only from a pcie reset until one of the functions changes its status from d0u to d0a. once one of the functions moves to d0a, it can be safely assumed that the 82576 is accessible to the host and there is no further need for this function. this reduces the possibility of malicious software using this command as a back door and limits the time firmware must be active in non-manageability mode.

if firmware is programmed not to do any other function apart from answering this command, it can request clock gating immediately after one of the functions changes its status from d0u to d0a.

the command is sent on a fixed smbus address of 0xc8. the format of the command is the smbus write data byte as follows:

table 3-23. command format

function          command    data byte
release eeprom    0xc7       0xaa

note: this solution requires a controllable smbus connection to the 82576. if more than one 82576 is in a state to accept this solution, all of the 82576s on the board ack this command and accept it. a device supporting this mode does not ack this command if not in d0u state. the 82576 is guaranteed to accept the command on the smbus interface and on address 0xc8, but it might be accepted on other configured interfaces and addresses as well.

after receiving a release eeprom command, firmware keeps its current state. it is the responsibility of the programmer that is updating the eeprom to send a firmware reset (if required) after the full eeprom update process completes.

3.3.1.7 eeprom-less support

the 82576 supports eeprom-less operation with the following limitations:
• non-manageability mode only.
• no support for legacy wake on lan (magic packets).
• no support for flash (no pxe code).
• no support for serial id pcie capability.
• no support for vital product data (vpd).
• all initialization values usually taken from the eeprom must be done by a custom host driver.
• intel sw drivers do not support eeprom-less operation.

3.3.1.7.1 access to the eeprom controlled feature

the eearbc register enables access to registers that are not accessible via regular csr access (such as pcie configuration read-only registers) by emulating the auto-read process. eearbc contains three strobe fields that emulate the internal strobes of the internal auto-read process. this register is common to both functions and should be accessed only after coordination with the other port.

table 3-24 lists the strobe to be used when emulating a read of a specific word of the eeprom auto-read feature.

table 3-24. strobes for eearbc auto-read emulation

eeprom word emulated (hex)    content                         strobe for port 0    strobe for port 1
0:2                           mac address                     valid_core0          valid_core1
0a/0f                         init control 1/2                valid_core0          valid_core1
0b/0c [1]                     sub-system device and vendor    valid_common         valid_common
1e/1d [2]                     dummy device id, rev id         valid_common         valid_common
21                            function control                valid_common         valid_common
0d [2]                        device id port 0                valid_common         n/a
11 [2]                        device id port 1                n/a                  valid_common
10                            sdp control                     n/a                  valid_core1
20                            sdp control                     valid_core0          n/a
14                            init control 3                  n/a                  valid_core1
24                            init control 3                  valid_core0          n/a
15/16/18/19/1a/1b/22/25/26    pcie and nc-si configuration    valid_common         valid_common
1c/1f                         led control port 0              valid_core0          n/a
2a/2b [3]                     led control port 1              n/a                  valid_core1
2e [3]                        watchdog configuration          valid_core0          valid_core1
2f                            vpd area                        n/a                  n/a

1. if word 0xa was accessed before the subsystem or subvendor id are set, care must be taken that the load subsystem ids bit in word 0xa is set.
2. if word 0xa was accessed before one of the device ids is set, care must be taken that the load device ids bit in word 0xa is set.
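for software that emulates the auto-read process, table 3-24 can be held as a lookup from eeprom word to strobe. the sketch below only transcribes the table; the names STROBES and strobe_for are hypothetical, None stands for n/a, and the entry 0:2 is read as words 0x00 through 0x02. how the selected strobe is written to eearbc is not shown, because the register layout is not given in this section.

```python
"""Table 3-24 as a lookup: EEPROM word -> (strobe for port 0, strobe for port 1)."""

CORE0, CORE1, COMMON = "valid_core0", "valid_core1", "valid_common"

STROBES = {}


def _add(words, port0, port1):
    """Register the same strobe pair for every word in the given group."""
    for word in words:
        STROBES[word] = (port0, port1)


_add(range(0x00, 0x03), CORE0, CORE1)       # mac address
_add([0x0A, 0x0F], CORE0, CORE1)            # init control 1/2
_add([0x0B, 0x0C], COMMON, COMMON)          # sub-system device and vendor
_add([0x1E, 0x1D], COMMON, COMMON)          # dummy device id, rev id
_add([0x21], COMMON, COMMON)                # function control
_add([0x0D], COMMON, None)                  # device id port 0
_add([0x11], None, COMMON)                  # device id port 1
_add([0x10], None, CORE1)                   # sdp control
_add([0x20], CORE0, None)                   # sdp control
_add([0x14], None, CORE1)                   # init control 3
_add([0x24], CORE0, None)                   # init control 3
_add([0x15, 0x16, 0x18, 0x19, 0x1A, 0x1B, 0x22, 0x25, 0x26], COMMON, COMMON)
_add([0x1C, 0x1F], CORE0, None)             # led control port 0
_add([0x2A, 0x2B], None, CORE1)             # led control port 1
_add([0x2E], CORE0, CORE1)                  # watchdog configuration
_add([0x2F], None, None)                    # vpd area


def strobe_for(word, port):
    """Return the strobe name for an emulated word on port 0 or 1, or None for n/a."""
    if port not in (0, 1):
        raise ValueError("port must be 0 or 1")
    if word not in STROBES:
        raise KeyError(f"word {word:#04x} is not listed in table 3-24")
    return STROBES[word][port]
```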
3. (footnote to table 3-24) some of the parameters that can be configured through the eearbc register can be set directly through regular registers, and thus usage of this mechanism is not needed for them. specifically, words 0x2a, 0x2b and 0x2e control only parameters that can be set through regular registers.

3.3.2 shared eeprom

the 82576 uses a single eeprom device to configure hardware default parameters for both lan devices, including ethernet individual addresses (ia), led behaviors, receive packet filters for manageability, wake-up capability, etc. certain eeprom words are used to specify hardware parameters that are lan device-independent (such as those that affect circuit behavior). other eeprom words are associated with a specific lan device. both lan devices access the eeprom to obtain their respective configuration settings.

3.3.2.1 eeprom deadlock avoidance

the eeprom is a shared resource between the following clients:

• hardware auto-read.
• port 0 lan driver accesses.
• port 1 lan driver accesses.
• firmware accesses.

all clients can access the eeprom using parallel access, where hardware implements the actual access to the eeprom. hardware can schedule these accesses so that all clients get served without starvation.

however, software and hardware clients can access the eeprom using bit banging. in this case, there is a request/grant mechanism that locks the eeprom to the exclusive usage of one client. if this client is stuck (without releasing the lock), the other clients are not able to access the eeprom. in order to avoid this, the 82576 implements a timeout mechanism, which releases the grant from a client that didn't toggle the eeprom bit-bang interface for more than two seconds.

note: if an agent that was granted access to the eeprom for bit-bang access didn't toggle the bit-bang interface for 500 ms, it should check if it still owns the interface before continuing the bit-banging.

3.3.2.2 eeprom map shared words

the eeprom map in section 6.1 identifies those words configuring either lan devices or the entire intel® 82576 gbe controller component as "both". those words configuring a specific lan device parameter are identified by their lan number.

the following eeprom words warrant additional notes specifically related to dual-lan support:
table 3-25. notes on eeprom words

ethernet address (ia) (shared between lans)
    the eeprom specifies the ia associated with the lan 0 device and used as the hardware default of the receive address registers for that device. the hardware-default ia for the lan 1 device is automatically determined by the same eeprom word and is set to the value of {ia lan 0 xor 0x010000000000}.

initialization control 1, initialization control 2 (shared between lans)
    these eeprom words specify hardware-default values for parameters that apply a single value to both lan devices, such as link configuration parameters required for auto-negotiation, wake-up settings, pcie bus advertised capabilities, etc.

initialization control 3 (unique to each lan)
    this eeprom word configures default values associated with each lan device's hardware connections, including which link mode (internal phy, sgmii, serdes) is used with this lan device. because a separate eeprom word configures the defaults for each lan, extra care must be taken to ensure that the eeprom image does not specify a resource conflict.

3.3.3 vital product data (vpd) support

the eeprom image might contain an area for vpd. this area is managed by the oem vendor and doesn't influence the behavior of hardware. word 0x2f of the eeprom image contains a pointer to the vpd area in the eeprom. a value of 0xffff means vpd is not supported and the vpd capability doesn't appear in the configuration space.

the vpd area should be aligned to a dword boundary in the eeprom and should start in the first 1 kbyte of the eeprom. the maximum area size is 256 bytes but can be smaller. the vpd block is built from a list of resources. a resource can be either large or small. the structure of these resources is listed in the following tables.

table 3-26. small resource structure

offset    content
0         tag = 0xxxxyyyb (type = small (0), item name = xxxx, length = yyy bytes)
1 - n     data

table 3-27. large resource structure

offset    content
0         tag = 1xxxxxxxb (type = large (1), item name = xxxxxxx)
1 - 2     length
3 - n     data
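the small and large resource formats in tables 3-26 and 3-27 can be walked with a short parser. this is a minimal sketch rather than the 82576's own parsing logic: the function name parse_vpd_resources is hypothetical, and it assumes that the two length bytes of a large resource are stored least-significant byte first and that a small resource with item name 0xf and length 0 (tag 0x78) is the end tag, as in the pci local bus specification.

```python
"""Sketch of a parser for the small/large VPD resource formats."""


def parse_vpd_resources(area):
    """Split a VPD byte string into (kind, item_name, data) tuples, stopping at the end tag."""
    resources = []
    pos = 0
    while pos < len(area):
        tag = area[pos]
        if tag & 0x80:
            # Large resource: 7-bit item name, then a 16-bit length (assumed little-endian).
            if pos + 3 > len(area):
                raise ValueError("truncated large resource header")
            kind = "large"
            name = tag & 0x7F
            length = area[pos + 1] | (area[pos + 2] << 8)
            start = pos + 3
        else:
            # Small resource: 4-bit item name and 3-bit length packed into the tag byte.
            kind = "small"
            name = (tag >> 3) & 0x0F
            length = tag & 0x07
            start = pos + 1
        if start + length > len(area):
            raise ValueError("resource data runs past the end of the VPD area")
        resources.append((kind, name, bytes(area[start:start + length])))
        pos = start + length
        if kind == "small" and name == 0x0F:
            break                        # end tag (0x78) terminates the list
    return resources
```

for example, a tag byte of 0x82 decodes as a large resource with item name 0x02, and 0x90 as a large resource with item name 0x10.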
the 82576 parses the vpd structure during the auto-load process (power up and pcie reset or warm reset) in order to detect the read-only and read/write area boundaries. the 82576 assumes the following vpd structure:

table 3-28. vpd structure

tag     structure type    length (bytes)                 data          resource description
0x82    large             length of identifier string    identifier    identifier string.
0x90    large             length of ro area              ro data       vpd-r list containing one or more vpd keywords. this part is optional and might not appear.
0x91    large             length of r/w area             rw data       vpd-w list containing one or more vpd keywords. this part is optional and might not appear.
0x78    small             n/a                            n/a           end tag.

note: the vpd-r and vpd-w structures can be in any order.

if the 82576 doesn't detect a value of 0x82 in the first byte of the vpd area, or the structure doesn't follow the description listed in table 3-28, it assumes the area is not programmed and the entire 256-byte area is read only. if a vpd-w tag is found after the vpd-r tag, the area defined by its size is writable via the vpd structure. refer to the pci 3.0 specification (appendix i) for details of the different tags.

in any case, the vpd area is accessible for read and write via the regular eeprom mechanisms, subject to the eeprom protection capabilities enabled. for example, if vpd is in the protected area, the vpd area is not accessible to the software device driver (parallel or serial), but is accessible through the vpd mechanism. if the vpd area is not in the protected area, then the software device driver can access all of it for read and write.

the vpd area can be accessed through the pcie configuration space vpd capability structure described in section 9.5.4. write accesses to a read-only area or any access outside of the vpd area via this structure are ignored.

note: write accesses to dwords that are only partially in the read/write area are ignored. it is the responsibility of vpd software to use the right alignment to enable a write to the entire area.

3.3.4 flash interface

3.3.4.1 flash interface operation

the 82576 provides two different methods for software access to the flash.

using the legacy flash transactions, the flash is read from or written to each time the host cpu performs a read or a write operation to a memory location that is within the flash address mapping or after a re-boot via accesses in the space indicated by the expansion rom base address register. all accesses to the flash require the appropriate command sequence for the device used. refer to the specific flash data sheet for more details on reading from or writing to flash. accesses to the flash are based on a direct decode of cpu accesses to a memory window defined in either:

1. the 82576's flash base address register (pcie control register at offset 0x14 or 0x18).
2. a certain address range of the ioaddr register defined by the io base address register (pcie control register at offset 0x18 or 0x20).
3. the expansion rom base address register (pcie control register at offset 0x30).

the 82576 controls accesses to the flash when it decodes a valid access.

note: flash read accesses must always be assembled by the 82576 each time the access is greater than a byte-wide access. the 82576 byte reads or writes to the flash take on the order of 2 µs. the 82576 continues to issue retry accesses during this time. the 82576 supports only byte writes to the flash.

another way for software to access the flash is directly using the flash's 4-wire interface through the flash access (fla) register. software can use this for reads, writes, or other flash operations (accessing the flash status register, erase, etc.). to directly access the flash, software should follow these steps:

1. write a 1b to the flash request bit (fla.fl_req).
2. read the flash grant bit (fla.fl_gnt) until it becomes 1b. it remains 0b as long as there are other accesses to the flash.
3. write or read the flash using the direct access to the 4-wire interface as defined in the fla register. the exact protocol used depends on the flash placed on the board and can be found in the appropriate datasheet.
4. write a 0b to the flash request bit (fla.fl_req).

3.3.4.2 flash write control

the flash is write controlled by the fwe bits in the eeprom/flash control and data (eec) register. note that software should not attempt to write to the flash device when writes are disabled (eec.fwe = 01b). behavior after such an operation is undefined and can result in component and/or system hangs.

after sending a one-byte write to the flash, software checks whether it can send the next byte (that is, whether the write process in the flash has finished) by reading the fla register. if the fla.fl_busy bit is set, the current write has not finished. if fla.fl_busy is clear, software can continue and write the next byte to the flash.

3.3.4.3 flash erase control

when software needs to erase the flash, it should set bit fla.fl_er in the fla register to 1b (flash erase) and then set bits eec.fwe in the eeprom/flash control register to 0b. hardware gets this command and sends the erase command to the flash. the erase process finishes by itself. software should wait for the end of the erase process before any further access to the flash. this can be checked by using the flash write control mechanism previously described. the op-code used for the erase operation is defined in the flashop register.

note: sector erase by software is not supported. in order to delete a sector, the serial (bit bang) interface should be used.
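the byte-by-byte write flow in section 3.3.4.2 (write one byte, poll fla.fl_busy, write the next) can be sketched against a simulated flash. this is an illustrative model only: FakeFlashPort, wait_not_busy, and flash_write are hypothetical names, the busy duration is arbitrary, and a real driver would read the fla register and write through the mapped flash window instead of calling these methods.

```python
"""Sketch of the byte write loop with FLA.FL_BUSY polling, using a simulated flash."""


class FakeFlashPort:
    """Hypothetical model: each byte write keeps FL_BUSY set for a few polls."""

    def __init__(self, size, busy_polls=2):
        self.mem = bytearray([0xFF]) * size
        self.busy_polls = busy_polls
        self._busy = 0

    def fl_busy(self):
        """Model of reading the FLA register and testing the FL_BUSY bit."""
        if self._busy > 0:
            self._busy -= 1
            return True
        return False

    def write_byte(self, addr, value):
        if self._busy > 0:
            raise RuntimeError("write issued while the flash is still busy")
        self.mem[addr] = value
        self._busy = self.busy_polls


def wait_not_busy(port, max_polls=10000):
    """Poll FL_BUSY until it clears, with a bound so a stuck device is reported."""
    for _ in range(max_polls):
        if not port.fl_busy():
            return
    raise TimeoutError("FLA.FL_BUSY did not clear")


def flash_write(port, addr, data):
    """Write data one byte at a time, waiting for the previous write to finish first."""
    for offset, value in enumerate(data):
        wait_not_busy(port)
        port.write_byte(addr + offset, value)
    wait_not_busy(port)                 # leave the flash idle for the next caller
```

the same wait_not_busy check applies after an erase, since the text states that erase completion is checked with the same mechanism.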
3.3.5 shared flash

the 82576 provides an interface to an external serial flash/rom memory device, as described in section 2.1.2. this flash/rom device can be mapped into memory and/or i/o address space for each lan device through the use of base address registers (bars). bit 13 of the eeprom initialization control word 3, associated with each lan device, selectively disables/enables whether the flash can be mapped for each lan device, by controlling the bar register advertisement and write ability.

3.3.5.1 flash access contention

the 82576 implements internal arbitration between flash accesses initiated through the lan 0 device and those initiated through the lan 1 device. if accesses from both lan devices are initiated during the same approximate time window, the first one is served first and only then the next one.

note: the 82576 does not synchronize between the two entities accessing the flash. contention caused by one entity reading and the other modifying the same location is possible.

to avoid this contention, accesses from both lan devices should be synchronized using external software synchronization of the memory or i/o transactions responsible for the access. it might be possible to ensure contention-avoidance by the nature of the software sequence.

3.3.5.2 flash deadlock avoidance

the flash is a shared resource between the following clients:

• port 0 lan driver accesses.
• port 1 lan driver accesses.
• bios parallel access via expansion rom mechanism.
• firmware accesses.

all clients can access the flash using parallel access, where hardware implements the actual access to the flash. hardware can schedule these accesses so that all the clients get served without starvation.

however, the driver and firmware clients can access the serial flash using bit banging. in this case, there is a request/grant mechanism that locks the serial flash to the exclusive usage of one client. if this client is stuck without releasing the lock, the other clients are unable to access the flash. in order to avoid this, the 82576 implements a time-out mechanism that releases the grant from a client that doesn't toggle the flash bit-bang interface for more than two seconds.

note: if an agent that was granted access to the flash for bit-bang access doesn't toggle the bit-bang interface for 500 ms, it should check that it still owns the interface before continuing the bit banging.

this mode is enabled by bit five in word 0xa of the eeprom.
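the ownership check recommended in the note (re-check the grant after 500 ms without toggling, given that hardware revokes it after two seconds) can be sketched as follows. this is a minimal model under stated assumptions: FakeArbiter and BitBangSession are hypothetical classes, the clock is injected so the example is deterministic, and in a real driver an interrupted eeprom or flash transaction would have to be restarted from its first step after the grant is re-acquired.

```python
"""Sketch of re-checking bit-bang ownership after an idle period (simulated arbiter)."""

IDLE_RECHECK_S = 0.5        # re-check ownership after 500 ms without toggling
GRANT_TIMEOUT_S = 2.0       # the grant is released after more than two seconds idle


class FakeArbiter:
    """Hypothetical model of the request/grant bits with the two-second time-out."""

    def __init__(self, clock):
        self.clock = clock
        self.granted = False
        self.last_toggle = 0.0

    def request(self):
        self.granted = True
        self.last_toggle = self.clock()

    def has_grant(self):
        if self.granted and self.clock() - self.last_toggle > GRANT_TIMEOUT_S:
            self.granted = False        # time-out: the grant was taken away
        return self.granted

    def toggle(self):
        if not self.has_grant():
            raise RuntimeError("bit-bang toggle without owning the interface")
        self.last_toggle = self.clock()


class BitBangSession:
    """Client side: verifies ownership before toggling if it has been idle too long."""

    def __init__(self, arbiter, clock):
        self.arbiter = arbiter
        self.clock = clock
        self.reacquired = 0
        self.arbiter.request()
        self.last = self.clock()

    def toggle(self):
        idle = self.clock() - self.last
        if idle >= IDLE_RECHECK_S and not self.arbiter.has_grant():
            self.arbiter.request()      # ownership was lost: request it again
            self.reacquired += 1        # caller must restart its transaction
        self.arbiter.toggle()
        self.last = self.clock()
```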
3.4 configurable i/o pins

3.4.1 general-purpose i/o (software-definable pins)

the 82576 has four software-defined pins (sdp pins) per port that can be used for miscellaneous hardware or software-controllable purposes. these pins and their function are bound to a specific lan device. for example, eight sdp pins cannot be associated with a single lan device. each pin can be individually configured to act as either an input or an output pin. the default direction of each of the four pins is configurable via the eeprom, as is the default value of any pins configured as outputs. to avoid signal contention, all four pins are set as input pins until after the eeprom configuration has been loaded.

in addition to all four pins being individually configurable as inputs or outputs, they can be configured for use as general-purpose interrupt (gpi) inputs. to act as gpi pins, the desired pins must be configured as inputs. a separate gpi interrupt-detection enable is then used to enable rising-edge detection of the input pin (rising-edge detection occurs by comparing values sampled at the internal clock rate as opposed to an edge-detection circuit). when detected, a corresponding gpi interrupt is indicated in the interrupt cause register.

the use, direction, and values of sdp pins are controlled and accessed using fields in the device control (ctrl) register and extended device control (ctrl_ext) register.

the sdps can be used for special-purpose mechanisms such as watchdog indication (see section 3.4.2 for details) or ieee 1588 support.

3.4.2 software watchdog

in some situations, it might be useful to give an indication to the manageability firmware or to external devices that the 82576 hardware or software device driver is not functional (because, in a pass-through nic, the 82576 can be bypassed if it is not functional). once the host driver is up and determines that the hardware is functional, the driver might reset the watchdog timer to indicate that the 82576 is functional. the driver then could re-arm the timer periodically. if the timer is not re-armed after a programmed timeout, an interrupt could be given to firmware and a pre-programmed sdp (sdp0[0] or sdp1[0]) could be raised. note that an sdp indication is shared between the ports. in addition, icr[26] could be set to give an interrupt to the driver when a timeout is reached.

the register controlling this feature is wdstp. this register enables the setting of a time-out period and the activation of this mode. both values get their default from the eeprom. re-arming of the timer is accomplished by setting the wdswsts.dev_functional bit.

if the device driver needs to trigger the watchdog immediately because it suspects the 82576 is stuck, the driver can set the wdswsts.force_wd bit. it can also give firmware a reason indication by using the wdswsts.stuck_reason field.

the watchdog feature provides the driver a way to indicate to the firmware that the 82576 is not functional. note that the watchdog feature has no logic to detect if hardware is not functional. if the 82576 is not functional, the watchdog timer expires due to the driver not being able to access the hardware, indicating a problem.
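the watchdog behavior described above (a programmed time-out, periodic re-arming through wdswsts.dev_functional, and an immediate trigger through wdswsts.force_wd) can be modeled in a few lines. this is a behavioral sketch only: SoftwareWatchdog and its methods are hypothetical, the clock is injected for repeatability, and no register encodings are implied.

```python
"""Behavioral sketch of the software watchdog (no register encodings implied)."""


class SoftwareWatchdog:
    """Hypothetical model of the timer controlled by WDSTP and WDSWSTS."""

    def __init__(self, timeout_s, clock):
        self.timeout_s = timeout_s      # time-out period programmed in WDSTP
        self.clock = clock
        self.enabled = False
        self.deadline = None
        self.forced = False
        self.stuck_reason = 0

    def enable(self):
        """Activate the mode and start the first time-out period."""
        self.enabled = True
        self.kick()

    def kick(self):
        """Driver sets WDSWSTS.Dev_functional: the timer is re-armed."""
        self.deadline = self.clock() + self.timeout_s

    def force(self, reason):
        """Driver sets WDSWSTS.Force_WD and records WDSWSTS.Stuck_reason."""
        self.forced = True
        self.stuck_reason = reason

    def expired(self):
        """True when firmware (and the SDP, if configured) would see an indication."""
        if not self.enabled:
            return False
        return self.forced or self.clock() >= self.deadline
```

a driver would call kick from a periodic task whose interval is comfortably shorter than the programmed time-out.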
the sdp associated with the watchdog indication is set using the ctrl.sdp0_wde bit. in this mode, the ctrl.sdp0_iodir should be set to output. the ctrl.sdp0_data bit indicates polarity. setting this bit in one core causes watchdog indications for both ports on the sdp.

3.4.2.1 watchdog re-arm

after a watchdog indication is received, the following flow should be used to re-arm the mechanism:

1. clear the wd_enable bit in the wdstp register.
2. clear the sdp0_wde bit in the ctrl register.
3. set the sdp0_wde bit in the ctrl register.
4. set the wd_enable bit in the wdstp register.

3.4.3 leds

the 82576 provides four leds per port that can be used to indicate different statuses of the traffic. the default setup of the leds is done via eeprom words 0x1c, 0x1f for port 0 and words 0x2a, 0x2b for port 1. this setup is reflected in the ledctl register of each port. each software device driver can change its setup individually. for each of the leds the following parameters can be defined:

• mode: defines which information is reflected by this led. the encoding is described in the ledctl register.
• polarity: defines the polarity of the led.
• blink mode: determines whether the led should blink or be stable.

in addition, the blink rate of all leds can be defined. the possible rates are 200 ms or 83 ms for each phase. there is one rate for all the leds of a port.

3.5 network interfaces

3.5.1 overview

the 82576 mac provides a complete csma/cd function supporting ieee 802.3 (10 mb/s), 802.3u (100 mb/s), 802.3z and 802.3ab (1000 mb/s) implementations. the 82576 performs all of the functions required for transmission, reception, and collision handling called out in the standards.

each 82576 mac can be configured to use a different media interface. the 82576 supports the following potential configurations:

• internal copper phy.
• external serdes device such as an optical serdes (sfp or on board) or backplane connections.
• external sgmii device. this mode is used for sfp connections or external sgmii phys.

selection between the various configurations is programmable via each mac's extended device control register (ctrl_ext.link_mode bits) and the default is set via eeprom settings. table 3-29 lists the encoding of the link_mode field for each of the modes.
table 3-29. link mode encoding

link mode    82576 mode
00b          internal phy
01b          reserved
10b          sgmii
11b          serdes

the gmii/mii interface used to communicate between the mac and the internal phy or the sgmii pcs supports 10/100/1000 mb/s operation, with both half- and full-duplex operation at 10/100 mb/s, and full-duplex operation at 1000 mb/s.

the serdes function can be used to implement a fiber-optics-based solution or backplane connection without requiring an external tbi mode transceiver/serdes.

the sgmii interface can be used to connect to sfp modules. as such, this sgmii interface has the following limitations:

• no tx clock
• ac coupling only

the internal copper phy features 10/100/1000base-t signaling and is capable of performing intelligent power-management based on both the system power-state and lan energy-detection (detection of unplugged cables). power management includes the ability to shut down to an extremely low (powered-down) state when not needed, as well as the ability to auto-negotiate to lower-speed (and less power-hungry) 10/100 mb/s operation when the system is in low power-states.

3.5.2 mac functionality

3.5.2.1 internal gmii/mii interface

the 82576's mac and phy/pcs communicate through an internal gmii/mii interface that can be configured for either 1000 mb/s operation (gmii) or 10/100 mb/s (mii) mode of operation. for proper network operation, both the mac and phy must be properly configured (either explicitly via software or via hardware auto-negotiation) to identical speed and duplex settings.

all mac configuration is performed using device control registers mapped into system memory or i/o space; an internal mdio/mdc interface, accessible via software, is used to configure the phy operation.

3.5.2.2 mdio/mdc

the 82576 implements an ieee 802.3 mii management interface (also known as the management data input/output or mdio interface) between the mac and the phy. this interface provides the mac and software the ability to monitor and control the state of the phy. the mdio interface defines a physical connection, a special protocol that runs across the connection, and an internal set of addressable registers. the internal or external interface consists of a data line (mdio) and clock line (mdc), which are accessible by software via the mac register space.

• mdc (management data clock): this signal is used by the phy as a clock timing reference for information transfer on the mdio signal. the mdc is not required to be a continuous signal and can
be frozen when no management data is transferred. the mdc signal has a maximum operating frequency of 2.5 mhz.
• mdio (management data i/o): this internal signaling between the mac and phy logically represents a bi-directional data signal that is used to transfer control information and status to and from the phy (to read and write the phy management registers). asserting and interpreting value(s) on this interface requires knowledge of the special mdio protocol to avoid possible internal signal contention or miscommunication to/from the phy.

software can use mdio accesses to read or write registers in internal phy mode by accessing the 82576's mdic register (see section 8.2.4). when working in sgmii/serdes mode, the external phy (if it exists) can be accessed either through mdc/mdio as previously described, or via a two wire interface bus using the i2ccmd register (see section 8.18.8). the two wire interface bus and the mdc/mdio bus are connected via the same pins, and thus are mutually exclusive. in order to be able to control an external device, either by sfp or mdc/mdio, the i2c sfp enable bit in the initialization control 3 eeprom word should be set.

as the mdc/mdio command can be targeted either to the internal phy or to an external bus, the mdic.destination bit is used to define the target of the transaction.

note: each port has its own mdc/mdio or two wire interface bus and there is no sharing between the ports of the control port. in order to control both ports' phys via the same control bus, accesses to both phys should be done via the same port with different device addresses.

3.5.2.2.1 mdic register usage

for an mdi read cycle, the sequence of events is as follows:

1. the processor performs a pcie write cycle to the mii register with:
   • ready = 0b
   • interrupt enable set to 1b or 0b
   • opcode = 10b (read)
   • phyadd = phy address from the mdi register
   • regadd = register address of the specific register to be accessed (0 through 31).
2. the mac applies the following sequence on the mdio signal to the phy: <preamble><01><10><aaaaa><rrrrr><z>, where z stands for the mac tri-stating the mdio signal.
3. the phy returns the following sequence on the mdio signal: <0><dddddddddddddddd><z>.
4. the mac discards the leading bit and places the following 16 data bits in the mii register.
5. the 82576 asserts an interrupt indicating mdi "done" if the interrupt enable bit was set.
6. the 82576 sets the ready bit in the mii register indicating the read is complete.
7. the processor might read the data from the mii register and issue a new mdi command.

for an mdi write cycle, the sequence of events is as follows:

1. ready = 0b.
2. interrupt enable set to 1b or 0b.
3. opcode = 01b (write).
4. PHYADD = PHY address from the MDI register.
5. REGADD = register address of the specific register to be accessed (0 through 31).
6. Data = specific data for desired control of the PHY.
7. The MAC applies the following sequence on the MDIO signal to the PHY: <PREAMBLE><01><01><PHYADD><REGADD><10><DATA><IDLE>
8. The 82576 asserts an interrupt indicating MDI "Done" if the Interrupt Enable bit was set.
9. The 82576 sets the Ready bit in the MII register to indicate that the write operation completed.
10. The CPU might issue a new MDI command.

Note: An MDI read or write might take as long as 64 µs from the processor write to the Ready bit assertion.

If an invalid opcode is written by software, the MAC does not execute any accesses to the PHY registers.

If the PHY does not generate a 0b as the second bit of the turn-around cycle for reads, the MAC aborts the access, sets the E (error) bit, writes 0xFFFF to the data field to indicate an error condition, and sets the Ready bit.

Note: After a PHY reset, access through the MDIC register should not be attempted for 300 µs.

3.5.2.3 Duplex Operation with Copper PHY

The 82576 supports half-duplex and full-duplex 10/100 Mb/s MII mode either through the internal copper PHY or the SGMII interface. However, only full-duplex mode is supported when SerDes mode is used or in any 1000 Mb/s connection.

Configuration of the duplex operation of the 82576 can either be forced or determined via the auto-negotiation process. See section 3.5.4.3 for details on link configuration setup and resolution.

3.5.2.3.1 Full Duplex

All aspects of the IEEE 802.3, 802.3u, 802.3z, and 802.3ab specifications are supported in full-duplex operation. Full-duplex operation is enabled by several mechanisms, depending on the speed configuration of the 82576 and the specific capabilities of the link partner used in the application.
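As an illustration of the MDIC read and write sequences in section 3.5.2.2.1, the following Python sketch models how the command word might be assembled and how the result might be interpreted. The bit positions are assumptions (this section does not give them) and should be checked against the MDIC register description in section 8.2.4. The write_mdic and read_mdic callables are hypothetical stand-ins for whatever register access a platform provides.

```python
# Assumed MDIC field positions; verify against section 8.2.4 before use.
MDIC_DATA_MASK = 0xFFFF      # bits 15:0, data
MDIC_REGADD_SHIFT = 16       # bits 20:16, PHY register address
MDIC_PHYADD_SHIFT = 21       # bits 25:21, PHY address
MDIC_OP_WRITE = 0b01 << 26   # opcode 01b (write)
MDIC_OP_READ = 0b10 << 26    # opcode 10b (read)
MDIC_READY = 1 << 28         # set by the 82576 when the access completes
MDIC_INT_EN = 1 << 29        # request an interrupt on completion
MDIC_ERROR = 1 << 30         # E bit, set when a read is aborted


def _check(phyadd, regadd):
    if not (0 <= phyadd <= 31 and 0 <= regadd <= 31):
        raise ValueError("PHYADD and REGADD are 5-bit fields")


def mdic_read_cmd(phyadd, regadd, int_en=False):
    """Word the processor writes to start an MDI read (Ready = 0b)."""
    _check(phyadd, regadd)
    cmd = MDIC_OP_READ | (phyadd << MDIC_PHYADD_SHIFT) | (regadd << MDIC_REGADD_SHIFT)
    return cmd | (MDIC_INT_EN if int_en else 0)


def mdic_write_cmd(phyadd, regadd, data, int_en=False):
    """Word the processor writes to start an MDI write (Ready = 0b)."""
    _check(phyadd, regadd)
    cmd = MDIC_OP_WRITE | (phyadd << MDIC_PHYADD_SHIFT) | (regadd << MDIC_REGADD_SHIFT)
    cmd |= data & MDIC_DATA_MASK
    return cmd | (MDIC_INT_EN if int_en else 0)


def mdic_result(value):
    """Split a value read back from MDIC into (ready, error, data)."""
    return bool(value & MDIC_READY), bool(value & MDIC_ERROR), value & MDIC_DATA_MASK


def mdi_read(write_mdic, read_mdic, phyadd, regadd, max_polls=64):
    """Issue a read and poll for Ready; the callables are hypothetical."""
    write_mdic(mdic_read_cmd(phyadd, regadd))
    for _ in range(max_polls):
        ready, error, data = mdic_result(read_mdic())
        if ready:
            if error:
                raise IOError("MDI read aborted (E bit set, data reads 0xFFFF)")
            return data
    raise TimeoutError("MDIC Ready bit was not set")
```

A real driver would bound the polling loop by time rather than by count, since the text allows up to 64 µs per access, and would hold off any access for 300 µs after a PHY reset.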
During full-duplex operation, the 82576 can transmit and receive packets simultaneously across the link interface.

In full-duplex, transmission and reception are delineated independently by the GMII/MII control signals. Transmission starts when TX_EN is asserted, which indicates there is valid data on the TX_DATA bus driven from the MAC to the PHY/PCS. Reception is signaled by the PHY/PCS asserting the RX_DV signal, which indicates valid receive data on the RX_DATA lines to the MAC.

3.5.2.3.2 Half Duplex

In half-duplex operation, the MAC attempts to avoid contention with other traffic on the link by monitoring the CRS signal provided by the PHY and deferring to passing traffic. When the CRS signal is de-asserted or after a sufficient inter-packet gap (IPG) has elapsed after a transmission, frame transmission begins. The MAC signals the PHY/PCS with TX_EN at the start of transmission.
In the case of a collision, the PHY/SGMII detects the collision and asserts the COL signal to the MAC. Frame transmission stops within four link clock times and then the 82576 sends a jam sequence onto the link. After the end of a collided transmission, the 82576 backs off and attempts to re-transmit per the standard CSMA/CD method.

Note: The re-transmissions are done from the data stored internally in the 82576 MAC transmit packet buffer (no re-access to the data in host memory is performed).

The MAC behavior differs depending on whether a regular collision or a late collision is detected. If a regular collision is detected, the MAC always tries to re-transmit until the number of excessive collisions is reached. In the case of a late collision, the MAC retransmission is configurable. In addition, statistics are gathered on late collisions.

In the case of a successful transmission, the 82576 is ready to transmit any other frame(s) queued in the MAC's transmit FIFO after the minimum inter-frame spacing (IFS) of the link has elapsed.

During transmit, the PHY is expected to signal a carrier-sense (assert the CRS signal) back to the MAC before one slot time has elapsed. The transmission completes successfully even if the PHY fails to indicate CRS within the slot time window. If this situation occurs, the PHY might be either configured incorrectly or in a link-down situation. Such an event is counted in the Transmit without CRS statistic register (see section 8.19.11).

3.5.3 SerDes, SGMII Support

The 82576 can be configured to follow either the SGMII or the SerDes standard. When in SGMII mode, the 82576 can be configured to operate at 1 Gb/s, 100 Mb/s, or 10 Mb/s. At 10/100 Mb/s, it can be configured for half-duplex operation. When configured for SerDes operation, the port supports only 1 Gb/s, full-duplex operation.
Since the serial interfaces are defined as differential signals, internally the hardware has analog and digital blocks. Following is the initialization/configuration sequence for the analog and digital blocks.

3.5.3.1 SerDes Analog Block

The analog block may require some changes to its configuration registers in order to work properly. There is no special requirement for designers to make these changes, as the hardware internally updates the configuration using a default sequence or a sequence loaded from the EEPROM.

There is a provision for EEPROM-less systems, where software can generate the same changes that the hardware generates by writing the initialization sequence through the SCCTL register.

3.5.3.2 SerDes/SGMII PCS Block

The link setup for SerDes and SGMII is described in sections 3.5.4.1 and 3.5.4.2, respectively.

3.5.3.3 GbE Physical Coding Sub-Layer (PCS)

The 82576 integrates the 802.3z PCS function on-chip. The on-chip PCS circuitry is used when the link interface is configured for SerDes or SGMII operation and is bypassed for internal PHY mode.

The packet encapsulation is based on the Fibre Channel (FC0/FC1) physical layer and uses the same coding scheme to maintain transition density and DC balance. The physical layer device is the SerDes and is used for 1000BASE-SX, -LX, or -CX configurations.
3.5.3.3.1 8B10B Encoding/Decoding

The GbE PCS circuitry uses the same transmission-coding scheme used in the Fibre Channel physical layer specification. The 8B10B coding scheme was chosen by the standards committee in order to provide a balanced, continuous stream with sufficient transition density to allow for clock recovery at the receiving station. There is a 25% overhead for this transmission code, which accounts for the data-signaling rate of 1250 Mb/s with 1000 Mb/s of actual data.

3.5.3.3.2 Code Groups and Ordered Sets

Code group and ordered set definitions are defined in clause 36 of the IEEE 802.3z standard. These represent special symbols used in the encapsulation of GbE packets. The following table contains a brief description of defined ordered sets and is included for informational purposes only. See clause 36 of the IEEE 802.3z specification for more details.

Table 3-30. Brief Description of Defined Ordered Sets

Code | Ordered_Set | # of Code Groups | Usage
/C/ | Configuration | 4 | General reference to configuration ordered sets, either /C1/ or /C2/, which is used during auto-negotiation to advertise and negotiate link operation information between link partners. Last 2 code groups contain configuration base and next page registers.
/C1/ | Configuration 1 | 4 | See /C/. Differs from /C2/ in 2nd code group for maintaining proper signaling disparity (1).
/C2/ | Configuration 2 | 4 | See /C/. Differs from /C1/ in 2nd code group for maintaining proper signaling disparity (1).
/I/ | Idle | 2 | General reference to idle ordered sets. Idle characters are continually transmitted by the end stations and are replaced by encapsulated packet data. The transitions in the idle stream enable the SerDes to maintain clock and symbol synchronization between link partners.
/I1/ | Idle 1 | 2 | See /I/. Differs from /I2/ in 2nd code group for maintaining proper signaling disparity (1).
/I2/ | Idle 2 | 2 | See /I/. Differs from /I1/ in 2nd code group for maintaining proper signaling disparity (1).
/R/ | Carrier_Extend | 1 | This ordered set is used to indicate carrier extension to the receiving PCS. It is also used as part of the end_of_packet encapsulation delimiter as well as IPG for packets in a burst of packets.
/S/ | Start_of_Packet | 1 | The SPD (start_of_packet delimiter) ordered set is used to indicate the starting boundary of a packet transmission. This symbol replaces the last byte of the preamble received from the MAC layer.
/T/ | End_of_Packet | 1 | The EPD (end_of_packet delimiter) is comprised of three ordered sets. The /T/ symbol is always the first of these and indicates the ending boundary of a packet.
/V/ | Error_Propagation | 1 | The /V/ ordered set is used by the PCS to indicate error propagation between stations. This is normally intended to be used by repeaters to indicate collisions.

(1) The concept of running disparity is defined in the standard. In summary, this refers to the 1-0 and 0-1 transitions within 8B10B code groups.
3.5.4 Auto-Negotiation and Link Setup Features

The method for configuring the link between two link partners is highly dependent on the mode of operation as well as the functionality provided by the specific physical layer device (PHY or SerDes). For SerDes mode, the 82576 provides the complete 802.3z PCS function. For internal PHY mode, the PCS and auto-negotiation functions are maintained within the PHY. For SGMII mode, the 82576 supports the SGMII link auto-negotiation process, whereas the link auto-negotiation is done by the external PHY.

Configuring the link can be accomplished by several methods, ranging from software forcing link settings and software-controlled negotiation to MAC-controlled auto-negotiation and auto-negotiation initiated by a PHY. The following sections describe the processes of bringing the link up, including configuration of the 82576 and the transceiver, as well as the various methods of determining duplex and speed configuration.

The process of determining link configuration differs slightly based on the specific link mode (internal PHY, external SerDes, or SGMII) being used.

When operating in SerDes mode, the PCS layer performs auto-negotiation per clause 37 of the 802.3z standard. The transceiver used in this mode (the SerDes) does not participate in the auto-negotiation process, as all aspects of auto-negotiation are controlled by the 82576.

When operating in internal PHY mode, the PHY performs auto-negotiation per 802.3ab clause 40 and extensions to clause 28. Link resolution is obtained by the MAC from the PHY after the link has been established. The MAC accomplishes this via the MDIO interface, via specific signals from the internal PHY to the MAC, or by MAC auto-detection functions.

When operating in SGMII mode, the PCS layer performs SGMII auto-negotiation per the SGMII specification.
The external PHY is responsible for the Ethernet auto-negotiation process.

3.5.4.1 SerDes Link Configuration

When using SerDes link mode, link mode configuration can be performed using the PCS function in the 82576. The hardware supports both hardware and software auto-negotiation methods for determining the link configuration, as well as allowing for a manual configuration to force the link. Hardware auto-negotiation is the preferred method.

3.5.4.1.1 Signal Detect Indication

The SRDS_0/1_SIG_DET pins can be connected to a signal detect or loss-of-signal output that indicates when no laser light is being received when the 82576 is used in a 1000BASE-SX or -LX implementation (SerDes operation). It prevents false carrier cases occurring when transmission by a non-connected port couples into the input. Unfortunately, there is no standard polarity for this signal coming from different manufacturers. The CTRL.ILOS bit provides for inversion of the signal from different external SerDes vendors, and should be set when the external SerDes provides a negative-true loss-of-signal.

Note: This bit also inverts the link input that provides link status indication from the PHY (in GMII/MII mode) and thus should be set to 0 for proper internal PHY operation.

3.5.4.1.2 MAC Link Speed
SerDes operation is only defined for 1000 Mb/s operation. Other link speeds are not supported. When configured for the SerDes interface, the MAC speed-determination function is disabled and the Device Status register bits (STATUS.SPEED) indicate a value of 10b for 1000 Mb/s.

3.5.4.1.3 SerDes Mode Auto-Negotiation

In SerDes mode, after power up or an 82576 reset via PERST#, the 82576 initiates auto-negotiation based on the default settings in the Device Control and Transmit Configuration or PCS Link Control Word registers, as well as settings read from the EEPROM. If enabled in the EEPROM, the 82576 immediately performs auto-negotiation.

TBI mode auto-negotiation, as defined in clause 37 of the IEEE 802.3z standard, provides a protocol for two devices to advertise and negotiate a common operational mode across a GbE link. The 82576 fully supports the IEEE 802.3z auto-negotiation function when using the on-chip PCS and internal SerDes.

TBI mode auto-negotiation is used to determine the following information:
• Duplex resolution (even though the 82576 MAC only supports full-duplex in SerDes mode).
• Flow control configuration.

Note: Since speed for SerDes modes is fixed at 1000 Mb/s, speed settings in the Device Control register are unaffected by the auto-negotiation process.

Auto-negotiation can be initiated at power up or by asserting PERST# by enabling specific bits in the EEPROM.

The auto-negotiation process is accomplished by the exchange of /C/ ordered sets that contain the capabilities defined in the PCS_ANADV register in the 3rd and 4th symbols of the ordered sets. Next pages are supported using the PCS_NPTX_AN register.

Bits FD and LU in the Device Status (STATUS) register, and bits in the PCS_LSTS register, provide status information regarding the negotiated link.

Auto-negotiation can be initiated by the following:
• PCS_LCMD.AN_ENABLE transition from 0b to 1b
• Receipt of a /C/ ordered set during normal operation
• Receipt of a different value of the /C/ ordered set during the negotiation process
• Transition from loss of synchronization to synchronized state (if AN_ENABLE is set)
• PCS_LCMD.AN_RESTART transition from 0b to 1b

Resolution of the negotiated link determines device operation with respect to flow control capability and duplex settings. These negotiated capabilities override advertised and software-controlled device configuration.

Software must configure the PCS_ANADV fields to the desired advertised base page. The bits in the Device Control register are not mapped to the TXCONFIGWORD field in hardware until after auto-negotiation completes. Table 3-31 lists the mapping of the PCS_ANADV fields to the Config_reg base page encoding per clause 37 of the standard.

Table 3-31. 802.3z Advertised Base Page Mapping

Bits  | 15    | 14  | 13:12 | 11:9 | 8:7 | 6  | 5  | 4:0
Field | Nextp | Ack | RFLT  | Rsv  | ASM | Hd | Fd | Rsv
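The bit layout in Table 3-31 can be expressed as a pair of pack and unpack helpers. This is a minimal Python sketch of the 16-bit base page only; the function and key names are illustrative, and it does not model the PCS_ANADV register itself, whose layout is described elsewhere in the datasheet.

```python
def pack_base_page(fd=True, hd=False, asm=0, rflt=0, nextp=False):
    """Build a 16-bit advertised base page per Table 3-31.

    Ack (bit 14) is left clear here on the assumption that it is driven
    during the negotiation exchange; reserved fields (11:9 and 4:0) are
    written as zero.
    """
    if not 0 <= asm <= 3 or not 0 <= rflt <= 3:
        raise ValueError("ASM and RFLT are 2-bit fields")
    return (
        (int(nextp) << 15)
        | (rflt << 12)
        | (asm << 7)
        | (int(hd) << 6)
        | (int(fd) << 5)
    )


def unpack_base_page(word):
    """Split a 16-bit base page (for example a partner's) into its fields."""
    return {
        "nextp": bool((word >> 15) & 1),
        "ack": bool((word >> 14) & 1),
        "rflt": (word >> 12) & 0b11,
        "asm": (word >> 7) & 0b11,
        "hd": bool((word >> 6) & 1),
        "fd": bool((word >> 5) & 1),
    }
```

For example, advertising full duplex with both ASM bits set gives pack_base_page(fd=True, asm=0b11), which evaluates to 0x01A0.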
The partner advertisement can be seen in the PCS_LPAB and PCS_LPABNP registers.

3.5.4.1.4 Forcing Link

Forcing link can be accomplished by software by writing a 1b to CTRL.SLU, which forces the MAC PCS logic into a link-up state (enables listening to incoming characters when LOS is de-asserted by the internal or external SerDes).

Note: The PCS_LCMD.AN_ENABLE bit must be set to a logic zero to enable forcing link.

When link is forced via the CTRL.SLU bit, the link does not come up unless the LOS signal is asserted or an energy indication is received from the SerDes receiver, implying that there is a valid signal being received by the optics or the SerDes. The source of the signal detect is fixed using bit ENRGSRC in the CONNSW register.

3.5.4.1.5 HW Detection of Non-Auto-Negotiation Partner

Hardware can detect a SerDes partner that sends idle code groups continuously, but does not initiate or answer an auto-negotiation process. In this case, hardware initiates an auto-negotiation process, and if it fails after some timeout, a link up is assumed. To enable this functionality, the PCS_LCTL.AN_TIMEOUT_EN bit should be set. This mode can be used instead of the force link mode as a way to support a partner that does not support auto-negotiation.

3.5.4.2 SGMII Link Configuration

When working in SGMII mode, the actual link setting is done by the external PHY and is dependent on the settings of this PHY. The SGMII auto-negotiation process described in the sections that follow is only used to establish the MAC/PHY connection.

3.5.4.2.1 SGMII Auto-Negotiation

This auto-negotiation process is not dependent on the SRDS_0/1_SIG_DET signal, as this signal indicates the status of the PHY signal detection (usually used in an optical PHY).

The outcome of this auto-negotiation process includes the following information:
• Link status
• Speed
• Duplex

This information is used by hardware to configure the MAC when operating in SGMII mode.

Bits FD and LU of the Device Status (STATUS) register and bits in the PCS_LSTS register provide status information regarding the negotiated link.

Auto-negotiation can be initiated by the following:
• LRST transition from 1b to 0b.
• PCS_LCMD.AN_ENABLE transition from 0b to 1b.
• Receipt of a /C/ ordered set during normal operation.
• Receipt of a different value of the /C/ ordered set during the negotiation process.
• Transition from loss of synchronization to a synchronized state (if AN_ENABLE is set).
• PCS_LCMD.AN_RESTART transition from 0b to 1b.

Resolving the negotiated link determines the 82576 operation with respect to speed and duplex settings. These negotiated capabilities override advertised and software-controlled device configuration.

When working in SGMII mode, there is no need to set the PCS_ANADV register, as the MAC advertisement word is fixed. The result of the SGMII-level auto-negotiation can be read from the PCS_LPAB register.

3.5.4.2.2 Forcing Link

In SGMII, forcing of the link cannot be done at the PCS level, only in the external PHY. The forced speed and duplex settings are reflected by the SGMII auto-negotiation process; the MAC settings are automatically done according to this functionality.

3.5.4.2.3 MAC Speed Resolution

The MAC speed and duplex settings are always set according to the SGMII auto-negotiation process.

3.5.4.3 Copper PHY Link Configuration

When operating with the internal PHY, link configuration is generally determined by PHY auto-negotiation. The software device driver must intervene in cases where a successful link is not negotiated or the designer desires to manually configure the link. The following sections discuss the methods of link configuration for copper PHY operation.

3.5.4.3.1 PHY Auto-Negotiation (Speed, Duplex, Flow Control)

When using a copper PHY, the PHY performs the auto-negotiation function. The operational details are described in the IEEE P802.3ab draft standard and are not included here.

Auto-negotiation provides a method for two link partners to exchange information in a systematic manner in order to establish a link configuration providing the highest common level of functionality supported by both partners.
Once configured, the link partners exchange configuration information to resolve link settings such as:
• Speed: 10/100/1000 Mb/s
• Duplex: full or half
• Flow control operation

PHY-specific information required for establishing the link is also exchanged.

Note: If flow control is enabled in the 82576, the settings for the desired flow control behavior must be set by software in the PHY registers and auto-negotiation restarted. After auto-negotiation completes, the software device driver must read the PHY registers to determine the resolved flow control behavior of the link and reflect these in the MAC register settings (CTRL.TFCE and CTRL.RFCE).

Once PHY auto-negotiation completes, the PHY asserts a link indication (LINK) to the MAC. Software must have set the Set Link Up bit in the Device Control register (CTRL.SLU) before the MAC recognizes the link indication from the PHY and can consider the link to be up.
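The note above leaves the resolution step to the software device driver. The following Python sketch shows one way that step might look, using the pause resolution rules of IEEE 802.3 Annex 28B, which are not reproduced in this section. The inputs are the PAUSE and ASM_DIR bits that software would read from the local and link partner advertisement registers in the PHY. The CTRL bit positions are assumptions to be checked against section 8.2.1.

```python
CTRL_RFCE = 1 << 27  # assumed position of CTRL.RFCE
CTRL_TFCE = 1 << 28  # assumed position of CTRL.TFCE


def resolve_pause(local_pause, local_asm_dir, partner_pause, partner_asm_dir):
    """Return (tfce, rfce) for the local device per IEEE 802.3 Annex 28B."""
    if local_pause and partner_pause:
        return True, True                 # symmetric flow control
    if local_asm_dir and partner_asm_dir:
        if local_pause:
            return False, True            # local device only honors PAUSE
        if partner_pause:
            return True, False            # local device only sends PAUSE
    return False, False                   # flow control disabled


def apply_flow_control(ctrl, tfce, rfce):
    """Reflect the resolved behavior in a CTRL register value."""
    ctrl &= ~(CTRL_TFCE | CTRL_RFCE)
    if tfce:
        ctrl |= CTRL_TFCE
    if rfce:
        ctrl |= CTRL_RFCE
    return ctrl
```

The asymmetric cases are the ones that are easy to get backwards: a device that advertises only ASM_DIR can end up transmitting PAUSE frames without being required to honor them.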
3.5.4.3.2 MAC Speed Resolution

For proper link operation, both the MAC and PHY must be configured for the same speed of link operation. The speed of the link can be determined and set by several methods with the 82576. These include:
• Software-forced configuration of the MAC speed setting based on PHY indications, which might be determined as follows:
   • Software reads PHY registers directly to determine the PHY's auto-negotiated speed
   • Software reads the PHY's internal PHY-to-MAC speed indication (SPD_IND) using the MAC STATUS.SPEED register
   • Software asks the MAC to attempt to auto-detect the PHY speed from the PHY-to-MAC RX_CLK, then programs the MAC speed accordingly
• The MAC automatically detects and sets its link speed based on PHY indications by using the PHY's internal PHY-to-MAC speed indication (SPD_IND)

Aspects of these methods are discussed in the sections that follow.

3.5.4.3.2.1 Forcing MAC Speed

There might be circumstances when the software device driver must forcibly set the link speed of the MAC. This can occur when the link is manually configured. To force the MAC speed, the software device driver must set the CTRL.FRCSPD (force-speed) bit to 1b and then write the speed bits in the Device Control register (CTRL.SPEED) to the desired speed setting. See section 8.2.1 for details.

Note: Forcing the MAC speed using CTRL.FRCSPD overrides all other mechanisms for configuring the MAC speed and can yield non-functional links if the MAC and PHY are not operating at the same speed/configuration.

When forcing the 82576 to a specific speed configuration, the software device driver must also ensure the PHY is configured to a speed setting consistent with MAC speed settings.
This implies that software must access the PHY registers to either force the PHY speed or to read the PHY status register bits that indicate the link speed of the PHY.

Note: Forcing speed settings by CTRL.SPEED can also be accomplished by setting the CTRL_EXT.SPD_BYPS bit. This bit bypasses the MAC's internal clock switching logic and gives the software device driver complete control of when the speed setting takes place. The CTRL.FRCSPD bit uses the MAC's internal clock switching logic, which does delay the effect of the speed change.

3.5.4.3.2.2 Using Internal PHY Direct Link-Speed Indication

The 82576's internal PHY provides a direct internal indication of its speed to the MAC (SPD_IND). When using the internal PHY, the most direct method for determining the PHY link speed and either manually or automatically configuring the MAC speed is based on these direct speed indications.

For MAC speed to be set/determined from these direct internal indications from the PHY, the MAC must be configured such that CTRL.ASDE and CTRL.FRCSPD are both 0b (both auto-speed detection and forced-speed override disabled). After configuring the Device Control register, MAC speed is re-configured automatically each time the PHY indicates a new link-up event to the MAC.

When MAC speed is neither forced nor auto-sensed by the MAC, the current MAC speed setting and the speed indicated by the PHY are reflected in the Device Status register bits STATUS.SPEED.
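The two configurations described in sections 3.5.4.3.2.1 and 3.5.4.3.2.2 differ only in a few CTRL bits. The Python sketch below models both on a plain integer holding the CTRL value. The bit positions and the 00b/01b speed encodings are assumptions (only the 10b encoding for 1000 Mb/s appears in this chapter) and should be confirmed in sections 8.2.1 and 8.2.2.

```python
# Assumed field positions; confirm in the CTRL and STATUS register descriptions.
CTRL_ASDE = 1 << 5             # auto-speed detection enable
CTRL_SPEED_SHIFT = 8           # CTRL.SPEED, bits 9:8
CTRL_SPEED_MASK = 0b11 << CTRL_SPEED_SHIFT
CTRL_FRCSPD = 1 << 11          # force speed
STATUS_SPEED_SHIFT = 6         # STATUS.SPEED, bits 7:6

SPEED_CODE = {10: 0b00, 100: 0b01, 1000: 0b10}


def force_mac_speed(ctrl, mbps):
    """Return CTRL with FRCSPD set and SPEED programmed for mbps."""
    code = SPEED_CODE[mbps]    # KeyError for unsupported speeds
    ctrl &= ~CTRL_SPEED_MASK
    return ctrl | CTRL_FRCSPD | (code << CTRL_SPEED_SHIFT)


def use_phy_speed_indication(ctrl):
    """Return CTRL with ASDE and FRCSPD cleared so SPD_IND sets the speed."""
    return ctrl & ~(CTRL_ASDE | CTRL_FRCSPD)


def status_speed_mbps(status):
    """Decode STATUS.SPEED; 11b is treated as 1000 Mb/s (assumption)."""
    code = (status >> STATUS_SPEED_SHIFT) & 0b11
    return {0b00: 10, 0b01: 100}.get(code, 1000)
```

When forcing speed, the PHY must be forced to the same setting through its own registers, as the surrounding text points out; this sketch covers the MAC side only.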
3.5.4.3.3 MAC Full-/Half-Duplex Resolution

The duplex configuration of the link is also resolved by the PHY during the auto-negotiation process. The 82576's internal PHY provides an internal indication to the MAC of the resolved duplex configuration using an internal full-duplex indication (FDX).

When using the internal PHY, this internal duplex indication is normally sampled by the MAC each time the PHY indicates the establishment of a good link (LINK indication). The PHY's indicated duplex configuration is applied in the MAC and reflected in the MAC Device Status register (STATUS.FD).

Software can override the duplex setting of the MAC via the CTRL.FD bit when the CTRL.FRCDPLX (force duplex) bit is set. If CTRL.FRCDPLX is 0b, the CTRL.FD bit is ignored and the PHY's internal duplex indication is applied.

3.5.4.3.4 Using PHY Registers

The software device driver might be required under some circumstances to read from, or write to, the MII management registers in the PHY. These accesses are performed via the MDIC register (see section 8.2.4). The MII registers enable the software device driver to have direct control over the PHY's operation, which can include:
• Resetting the PHY
• Setting preferred link configuration for advertisement during the auto-negotiation process
• Restarting the auto-negotiation process
• Reading auto-negotiation status from the PHY
• Forcing the PHY to a specific link configuration

The set of PHY management registers required for all PHY devices can be found in the IEEE P802.3ab draft standard. The registers for the 82576 PHY are described in section 3.5.8.

3.5.4.3.5 Comments Regarding Forcing Link

Forcing link in GMII/MII mode (internal PHY) requires the software device driver to configure both the MAC and PHY in a consistent manner with respect to each other as well as the link partner.
After initialization, the software device driver configures the desired modes in the MAC, then accesses the PHY registers to set the PHY to the same configuration.

Before enabling the link, the speed and duplex settings of the MAC can be forced by software using the CTRL.FRCSPD, CTRL.FRCDPLX, CTRL.SPEED, and CTRL.FD bits. After the PHY and MAC have both been configured, the software device driver should write a 1b to the CTRL.SLU bit.

3.5.4.4 Loss of Signal/Link Status Indication

For all modes of operation, an LOS/LINK signal provides an indication of physical link status to the MAC. When the MAC is configured for optical SerDes mode, the input reflects the loss-of-signal connection from the optics. In backplane mode, where there is no external LOS indication, an internal indication from the SerDes receiver can be used. In SFP systems, the LOS indication from the SFP can be used. In internal PHY mode, this signal from the PHY indicates whether the link is up or down, typically after successful auto-negotiation. Assuming that the MAC has been configured with CTRL.SLU = 1b, the MAC status bit STATUS.LU, when read, generally reflects whether the PHY or SerDes has link (except under forced-link setup, where even the PHY link indication might have been forced).
When the link indication from the PHY is de-asserted or the loss-of-signal is asserted from the SerDes, the MAC considers this to be a transition to a link-down situation (such as cable unplugged or loss of link partner). If the Link Status Change (LSC) interrupt is enabled, the MAC generates an interrupt to be serviced by the software device driver.

3.5.5 Ethernet Flow Control (FC)

The 82576 supports flow control as defined in 802.3x as well as the specific operation of asymmetrical flow control defined by 802.3z.

Flow control is implemented as a means of reducing the possibility of receive buffer overflows, which result in the dropping of received packets, and allows for local control of network congestion levels. This is accomplished by sending an indication to a transmitting station of a nearly full receive buffer condition at a receiving station.

The implementation of asymmetric flow control allows one link partner to send flow control packets while being allowed to ignore their reception (for example, it is not required to respond to PAUSE frames).

The following registers are defined for the implementation of flow control:
• CTRL.RFCE field is used to enable reception of legacy flow control packets and reaction to them.
• CTRL.TFCE field is used to enable transmission of legacy flow control packets.
• Flow Control Address Low, High (FCAL/H) - 6-byte flow control multicast address.
• Flow Control Type (FCT) - 16-bit field to indicate flow control type.
• Flow control bits in the Device Control (CTRL) register - enable flow control modes.
• Discard PAUSE Frames (DPF) and Pass MAC Control Frames (PMCF) in RCTL - control the forwarding of control packets to the host.
• Flow Control Receive Threshold High (FCRTH[1:0]) - a set of 13-bit high watermarks indicating receive buffer fullness. A single watermark is used in link FC mode.
• Flow Control Receive Threshold Low (FCRTL[1:0]) - a set of 13-bit low watermarks indicating receive buffer emptiness. A single watermark is used in link FC mode.
• Flow Control Transmit Timer Value (FCTTV) - a set of 16-bit timer values to include in transmitted PAUSE frames. A single timer is used in link FC mode.
• Flow Control Refresh Threshold Value (FCRTV) - 16-bit PAUSE refresh threshold value.

3.5.5.1 MAC Control Frames and Receiving Flow Control Packets

3.5.5.1.1 Structure of 802.3X FC Packets

Three comparisons are used to determine the validity of a flow control frame:
1. A match on the 6-byte multicast address for MAC control frames or on the station address of the 82576 (Receive Address Register 0).
2. A match on the type field.
3. A comparison of the MAC control op-code field.

The 802.3x standard defines the MAC control frame multicast address as 01-80-C2-00-00-01.

The type field in the FC packet is compared against an IEEE reserved value of 0x8808.
the final check for a valid pause frame is the mac control op-code. at this time only the pause control frame op-code is defined. it has a value of 0x0001.

frame-based flow control differentiates xoff from xon based on the value of the pause timer field. non-zero values constitute xoff frames while a value of zero constitutes an xon frame. values in the timer field are in units of pause quantum (slot time). a pause quantum lasts 64 byte times, which is converted to an absolute time duration according to the line speed.

note: an xon frame signals the cancellation of the pause initiated by an xoff frame (a pause of zero pause quanta).

table 3-32 lists the structure of an 802.3x fc packet.

3.5.5.1.2 operation and rules

the 82576 operates in link fc.
• link fc is enabled by the rfce bit in the ctrl register.

note: link flow control capability is negotiated between link partners via the auto-negotiation process. it is the software device driver's responsibility to reconfigure the link flow control configuration after the capabilities to be used were negotiated, as it might modify the value of these bits based on the resolved capability between the local device and the link partner.

a link fc frame received while in pfc mode might be ignored. a pfc frame received while in link fc mode is ignored.

once the receiver has validated receiving an xoff, or pause frame, the 82576 performs the following:
• increments the appropriate statistics register(s).
• sets the flow_control state bit in the relevant fcsts[0-1] register.
• initializes the pause timer based on the packet's pause timer field (overwriting any current timer's value).
• disables packet transmission or schedules the disabling of transmission after the current packet completes.

resumption of transmission might occur under the following conditions:
• expiration of the pause timer
• reception of an xon frame (a frame with its pause timer set to 0b)

both conditions clear the relevant flow_control state bit in the relevant fcsts[0-1] register and transmission can resume. hardware records the number of received xon frames.

table 3-32. 802.3x packet format

field     value                size
da        01_80_c2_00_00_01    6 bytes
sa        port mac address     6 bytes
type      0x8808               2 bytes
op-code   0x0001               2 bytes
time      xxxx                 2 bytes
pad       -                    42 bytes
crc       -                    4 bytes
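The three validity checks and the field layout of table 3-32 can be sketched in a few lines of Python. This is an illustrative model only, not driver code: the function names are hypothetical, the 4-byte crc is left out, and the link speed given to `pause_seconds` is whatever the caller supplies.

```python
import struct

FC_MULTICAST_DA = bytes.fromhex("0180c2000001")  # 01-80-c2-00-00-01
MAC_CONTROL_TYPE = 0x8808
PAUSE_OPCODE = 0x0001
PAUSE_QUANTUM_BITS = 512  # 64 byte times


def build_pause_frame(src_mac: bytes, quanta: int) -> bytes:
    """Build a pause frame per table 3-32, without the trailing crc."""
    if len(src_mac) != 6:
        raise ValueError("source mac must be 6 bytes")
    if not 0 <= quanta <= 0xFFFF:
        raise ValueError("pause timer is a 16-bit field")
    fields = struct.pack("!HHH", MAC_CONTROL_TYPE, PAUSE_OPCODE, quanta)
    return FC_MULTICAST_DA + src_mac + fields + bytes(42)  # 42 pad bytes


def classify_fc_frame(frame: bytes, station_mac: bytes):
    """Return 'xoff', 'xon', or None if the frame fails any of the checks."""
    if len(frame) < 18:
        return None
    da = frame[0:6]
    ethertype, opcode, quanta = struct.unpack("!HHH", frame[12:18])
    if da != FC_MULTICAST_DA and da != station_mac:  # check 1: address
        return None
    if ethertype != MAC_CONTROL_TYPE:                # check 2: type field
        return None
    if opcode != PAUSE_OPCODE:                       # check 3: op-code
        return None
    return "xoff" if quanta != 0 else "xon"


def pause_seconds(quanta: int, link_bits_per_second: int) -> float:
    """Convert a pause timer value to seconds at the given line speed."""
    return quanta * PAUSE_QUANTUM_BITS / link_bits_per_second
```

For example, a timer value of 0xffff at 1 gb/s corresponds to 65535 x 512 bit times, or roughly 33.6 ms of pause.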
3.5.5.1.3 timing considerations

when operating at 1 gb/s line speed, the 82576 must not begin to transmit a (new) frame more than two pause-quantum-bit times after receiving a valid link xoff frame, as measured at the wires. a pause quantum is 512-bit times.

when operating in full duplex at 10 mb/s or 100 mb/s line speeds, the 82576 must not begin to transmit a (new) frame more than 576-bit times after receiving a valid link xoff frame, as measured at the wire.

3.5.5.2 pause and mac control frames forwarding

two bits in the receive control register control forwarding of pause and mac control frames to the host. these bits are discard pause frames (dpf) and pass mac control frames (pmcf):
• the dpf bit controls forwarding of pause packets to the host.
• the pmcf bit controls forwarding of non-pause mac control packets to the host.

note: when virtualization is enabled, forwarded control packets are queued according to the regular switching procedure defined in section 7.10.3.5.

when flow control reception is disabled (ctrl.rfce = 0), flow control packets are not recognized and are parsed as regular packets.

3.5.5.3 transmission of pause frames

the 82576 generates pause packets to ensure there is enough space in its receive packet buffers to avoid packet drop. the 82576 monitors the fullness of its receive packet buffers and compares it with the contents of a programmable threshold. when the threshold is reached, the 82576 sends a pause frame. the 82576 also supports the sending of link flow control (fc).

note: similar to receiving link flow control packets previously mentioned, link xoff packets can be transmitted only if this configuration has been negotiated between the link partners via the auto-negotiation process or some higher level protocol. the setting of this bit by the software device driver indicates the desired configuration.

table 3-33.
forwarding of pause packet to host (dpf bit)

rfce   dpf   are fc packets forwarded to host?
0      x     yes. packets need to pass the l2 filters (see section 7.1.2.1). [1]
1      0     yes. packets need to pass the l2 filters (see section 7.1.2.1).
1      1     no.

[1] the flow control multicast address is not part of the l2 filtering unless explicitly required.

table 3-34. transfer of non-pause control packets to host (pmcf bit)

rfce   pmcf   are non-fc mac control packets forwarded to host?
0      x      yes. packets need to pass the l2 filters (see section 7.1.2.1).
x      0      yes. packets need to pass the l2 filters (see section 7.1.2.1).
1      1      reserved.
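Read as decision logic, tables 3-33 and 3-34 reduce to two small functions. The sketch below is only a restatement of the table rows; the function names and the string return values are illustrative assumptions, and "forward" means the packet still has to pass the l2 filters.

```python
def pause_forwarded_to_host(rfce: int, dpf: int) -> bool:
    """Table 3-33: pause packets are dropped only when rfce = 1 and dpf = 1."""
    return not (rfce == 1 and dpf == 1)


def non_pause_control_forwarding(rfce: int, pmcf: int) -> str:
    """Table 3-34: 'forward' (subject to l2 filters) or 'reserved'."""
    if rfce == 1 and pmcf == 1:
        return "reserved"
    return "forward"
```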
the transmission of flow control frames should only be enabled in full-duplex mode per the ieee 802.3 standard. software should ensure that the transmission of flow control packets is disabled when the 82576 is operating in half-duplex mode.

3.5.5.3.1 operation and rules

transmission of link pause frames is enabled by software writing a 1b to the tfce bit in the device control register.

the content of the flow control receive threshold high (fcrth) register determines at what point the 82576 first transmits a pause frame. the 82576 monitors the fullness of the receive packet buffer and compares it with the contents of fcrth. when the threshold is reached, the 82576 sends a pause frame with its pause time field equal to fcttv.

at this time, the 82576 starts counting an internal shadow counter (reflecting the pause timeout counter at the partner end) from zero. when the counter reaches the value indicated in the fcrtv register, then, if the pause condition is still valid (meaning that the buffer fullness is still above the high watermark), an xoff message is sent again.

once the receive buffer fullness reaches the low watermark, the 82576 sends an xon message (a pause frame with a timer value of zero). software enables this capability with the xone field of the fcrtl register.

the 82576 sends a pause frame if it has previously sent one and the packet buffer overflows. this is intended to minimize the number of packets dropped if the first pause frame did not reach its target.

since the secure receive packets use the same data path, the behavior is identical when secure packets are received.

3.5.5.3.2 software initiated pause frame transmission

the 82576 has the added capability to transmit an xoff frame via software. this is accomplished by software writing a 1b to the swxoff bit of the transmit control register.
once this bit is set, hardware initiates the transmission of a pause frame in a manner similar to that automatically generated by hardware. the swxoff bit is self-clearing after the pause frame has been transmitted.

note: the flow control refresh threshold mechanism does not work in the case of software-initiated flow control. therefore, it is the software's responsibility to re-generate pause frames before expiration of the pause counter at the other partner's end.

the state of the ctrl.tfce bit or the negotiated flow control configuration does not affect software generated pause frame transmission.

note: software sends an xon frame by programming a 0b in the pause timer field of the fcttv register. the software emission of an xon packet is not allowed while the hardware flow control mechanism is active, as both use the fcttv registers for different purposes.

xoff transmission is not supported in 802.3x for half-duplex links. software should not initiate an xoff or xon transmission if the 82576 is configured for half-duplex operation.

when flow control is disabled, pause packets (xon, xoff, and other fc) are not detected as flow control packets and can be counted in a variety of counters (such as multicast).
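The hardware-driven xoff/refresh/xon sequence of section 3.5.5.3.1 can be modelled as a small state machine. The sketch below is a simplified behavioural model under stated assumptions, not a description of the silicon: the class and method names are hypothetical, the shadow counter advances in caller-supplied ticks (the text does not give its unit here), and the overflow-triggered pause frame is not modelled.

```python
from dataclasses import dataclass


@dataclass
class PauseTransmitter:
    fcrth: int          # high watermark: first xoff is sent at this fullness
    fcrtl: int          # low watermark: xon is sent at or below this fullness
    fcttv: int          # pause timer value placed in transmitted xoff frames
    fcrtv: int          # refresh threshold for the shadow counter
    xone: bool = True   # xon transmission enabled (xone field of fcrtl)
    paused: bool = False
    shadow: int = 0     # shadow of the partner's pause timeout counter

    def tick(self, fullness: int, elapsed: int = 1):
        """Advance the model; return the pause timer of the frame sent, or None."""
        if not self.paused:
            if fullness >= self.fcrth:
                self.paused = True
                self.shadow = 0
                return self.fcttv           # first xoff
            return None
        if fullness <= self.fcrtl:
            self.paused = False
            return 0 if self.xone else None  # xon is a pause frame with timer 0
        self.shadow += elapsed
        if self.shadow >= self.fcrtv and fullness >= self.fcrth:
            self.shadow = 0
            return self.fcttv               # refresh xoff
        return None
```

With `fcrtv` set lower than `fcttv`, the refresh xoff goes out before the partner's pause timer expires, which is the purpose of the refresh threshold.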
3.5.5.4 ipg control and pacing

the 82576 supports the following modes of controlling ipg duration:
• fixed ipg - ipg is extended by a fixed duration.
• limiting payload rate - ipg is extended to limit the average data rate on the link.

3.5.5.4.1 fixed ipg extension

the 82576 allows controlling of the ipg duration. the ipgt configuration field enables an extension of ipg in 4-byte increments. one possible use of this capability is to allow the insertion of bytes into the transmit packet after it has been transmitted by the 82576 without violating the minimum ipg requirements. for example, a security device connected in series to the 82576 might add security headers to transmit packets before the packets go to the network.

3.5.5.4.2 limiting payload rate

the 82576 allows controlling the maximum payload rate transmitted on the wire. frames are spaced by an amount of idle time proportional to the maximum rate to achieve and to the length of the last frame transmitted. this feature is enabled by clearing bits tcrsbyp and tcrscomp in the rttpcs register. the maximum payload rate is defined for the entire link by setting the rs_ena bit in the rttptcrc[0] register and by configuring the rttptcrc[0] and rttptcrm[0] registers.

3.5.6 loopback support

3.5.6.1 general

the 82576 supports the following types of internal loopback in the lan interfaces:
• mac loopback (point 1 in figure)
• internal phy loopback (point 2 in figure)
• internal serdes loopback (point 3 in figure)
• external phy loopback (point 4 in figure)

functionality for mac loopback is tested using phy loopback on this device. use phy loopback instead of mac loopback on the 82576. for more information on loopback, contact your intel representative for access to the intel® ethernet controllers loopback modes document.

by setting the device to loopback mode, packets that are transmitted towards the line are looped back to the host. the 82576 is fully functional in these modes, but does not transmit data over the lines. figure 3-5 shows the points of loopback. for more details on usage and loopback test setup, see the intel® ethernet controllers loopback modes application note.
3.5.6.2 mac loopback

mac loopback is not used on this device.

3.5.6.3 internal phy loopback

in internal phy loopback the serdes block is not functional and data is looped back at the end of the phy functionality. this means that all of the design that is functional in copper mode is involved in the loopback.

3.5.6.3.1 setting the 82576 to phy loopback mode

the following procedure should be used to put the 82576 in phy loopback mode:
• set link mode to phy: ctrl_ext.link_mode (csr 0x18 bits 23:22) = 0b00
• in phy control register (address 0 in the phy):
  • set duplex mode (bit 8)
  • set loopback bit (bit 14)
  • clear auto neg enable bit (bit 12)
  • set speed using bits 6 and 13 as described in eas.
  • register value should be:
    for 10 mbps: 0x4100
    for 100 mbps: 0x6100
    for 1000 mbps: 0x4140
• in port control register (address 16 (0x10) in the phy), set bit 14 (link disable). this is not mandatory for 1g but is required for 10/100 mbps.

while in loopback mode, polling for link might not return a valid link state. transmit and receive normally.

note: make sure a configure command is re-issued (loopback bits set to 00b) to cancel the loopback mode.

figure 3-5. intel® 82576 gbe controller loopback modes
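The three phy control register values listed in section 3.5.6.3.1 follow directly from the bit positions in the procedure. The sketch below recomputes them. The constant and function names are hypothetical, and the speed encoding (bit 13 as the low speed-select bit, bit 6 as the high one) is the usual mii control register convention, assumed here because it reproduces the listed values.

```python
LOOPBACK = 1 << 14
SPEED_SELECT_LSB = 1 << 13
AUTONEG_ENABLE = 1 << 12   # left clear in loopback mode
DUPLEX_MODE = 1 << 8
SPEED_SELECT_MSB = 1 << 6

_SPEED_BITS = {
    10: 0,
    100: SPEED_SELECT_LSB,
    1000: SPEED_SELECT_MSB,
}


def phy_loopback_control(speed_mbps: int) -> int:
    """Value for phy register 0 in internal phy loopback at the given speed."""
    if speed_mbps not in _SPEED_BITS:
        raise ValueError("speed must be 10, 100 or 1000")
    return LOOPBACK | DUPLEX_MODE | _SPEED_BITS[speed_mbps]
```

Calling `hex(phy_loopback_control(100))` gives `0x6100`, matching the value the procedure lists for 100 mbps.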
3.5.6.4 serdes loopback

in serdes loopback the phy block is not functional and data is looped back at the end of the serdes functionality. this means that all of the design that is functional in serdes/sgmii mode is involved in the loopback.

note: serdes loopback is functional only if the serdes link is up.

3.5.6.4.1 setting serdes loopback mode

the following procedure should be used to put the 82576 in serdes loopback mode:
• set link mode to serdes: ctrl_ext.link_mode (csr 0x18 bits 23:22) = 0b11
• configure serdes (register 4 bit 1) to loopback: write the value 0x410 to serdesctl (csr 0x00024)
• move to force mode by setting the following bits:
  • ctrl.fd (csr 0x0 bit 0) = 1
  • ctrl.slu (csr 0x0 bit 6) = 1
  • ctrl.rfce (csr 0x0 bit 27) = 0
  • ctrl.tfce (csr 0x0 bit 28) = 0
  • ctrl.ilos (csr 0x0 bit 7) = 1
  • ctrl.lrst (csr 0x0 bit 3) = 0
  • pcs_lctl.force_link (csr 0x04208 bit 5) = 1
  • pcs_lctl.fsd (csr 0x04208 bit 4) = 1
  • pcs_lctl.fdv (csr 0x04208 bit 3) = 1
  • pcs_lctl.flv (csr 0x04208 bit 0) = 1
  • pcs_lctl.an_enable (csr 0x04208 bit 16) = 0

3.5.6.5 external phy loopback

in external phy loopback the serdes block is not functional and data is sent through the mdi interface and looped back using an external loopback plug. this means that all of the design that is functional in copper mode is involved in the loopback.

3.5.6.5.1 setting the 82576 to external phy loopback mode

the following procedure should be used to put the 82576 in external phy loopback mode:
• set link mode to phy: ctrl_ext.link_mode (csr 0x18 bits 23:22) = 0b00
• in phy control register (address 0 in the phy), write 0x0140 to:
  • set duplex mode (bit 8)
  • clear loopback bit (bit 14)
  • clear auto neg enable bit (bit 12)
  • force 1 gbps mode (set bit 6 and clear bit 13)
• force master mode by setting gcon phy register (address 9 in the phy) to 0x1a00
• tune the phy dsp to loopback operation (in 1 gbps mode only) using the following sequence:
  • set phy register address 0x12 to 0x1610
  • enable loopback on twisted pair
  • disable flip chip
  • auto mdi-x
  • turn off next cancellers using the following command:
    • set phy register address 0x1f37 to 0x3f1c.

the above procedure puts the device in external phy loopback mode. after using the procedure, wait for the link to come up. once phy register 1 bit 2 is set (this can take up to 750 ms), transmit and receive normally. if you are unable to get link after 750 ms, reset the phy using ctrl.phy_rst (see section 4.2.1.10) and then repeat the above procedure.

when exiting external phy loopback mode, a full phy reset must be done. use ctrl.phy_rst (see section 4.2.1.10).

3.5.7 integrated copper phy functionality

the phy default configuration is determined by data from the eeprom, which is read right after power-on reset. the register set used to control the phy functionality (phyreg) is described in section 8.25.

3.5.7.1 phy initialization functionality

3.5.7.1.1 auto mdio register initialization

the 82576 phy supports an option to automatically initialize mdio registers with values from eeprom/rom if the hardware defaults are not adequate. in the 82576, this is performed by the mms unit (firmware). there are two types of register initialization:
1. general register initialization - any register in the phy can be initialized.
2. eeprom bit initialization - some bits in the phy mirror eeprom bits: 25.6, 25.3:0, and 26.0.

after any phy reset (power down included), the phy needs to be initialized for both steps 1 and 2. the register initialization is done by the mms (firmware) through the mac/phy mdio interface (mdic).

3.5.7.1.2 general register initialization

a block of data is allocated in eeprom/rom (see section 6.4). this block holds register addresses and data in mdic format (section 8.2.4).
every time a phy reset ends, this block is read from eeprom by the mms and is written to phy registers through the mdic registers and the mdio interface.
3.5.7.1.3 mirror bit initialization

there are a number of bits (oem bits) that reside in eeprom/mac control registers that have a mirror bit in the phy registers. these bits are also updated by the mms after every phy reset. they are updated after the general register initialization, through a read-modify-write sequence. the current mirror bits are: registers 25.6, 25.3:0, and 26.0.

3.5.7.2 determining link state

the phy and its link partner determine the type of link established through one of three methods:
• auto-negotiation
• parallel detection
• forced operation

auto-negotiation is the only method allowed by the 802.3ab standard for establishing a 1000base-t link, although forced operation could be used for test purposes. for 10/100 links, any of the three methods can be used. the following sections discuss each in greater detail.

figure 3-6 provides an overview of link establishment. first the phy checks if auto-negotiation is enabled. by default, the phy supports auto-negotiation (see phy register 0, bit 12). if auto-negotiation is not enabled, the phy forces operation as directed. if auto-negotiation is enabled, the phy begins transmitting fast link pulses (flps) and receiving flps from its link partner. if flps are received by the phy, auto-negotiation proceeds. the phy can also receive 100base-tx mlt3 and 10base-t normal link pulses (nlps). if either mlt3 or nlps are received, it aborts flp transmission and immediately brings up the corresponding half-duplex link.
3.5.7.2.1 false link

the phy does not falsely establish link with a partner operating at a different speed. for example, the phy does not establish a 1 gb/s or 10 mb/s link with a 100 mb/s link partner.

when the phy is first powered on, reset, or encounters a link down state, it must determine the line speed and operating conditions to use for the network link. the phy first checks the mdio registers (initialized via the hardware control interface or written by software) for operating instructions. using these mechanisms, designers can command the phy to do one of the following:
• force twisted-pair link operation to:
  • 1000t, full duplex
  • 1000t, half duplex
  • 100tx, full duplex
  • 100tx, half duplex
  • 10base-t, full duplex
  • 10base-t, half duplex
• allow auto-negotiation/parallel-detection.

figure 3-6. overview of link establishment
in the first six cases (forced operation), the phy immediately begins operating the network interface as commanded. in the last case, the phy begins the auto-negotiation/parallel-detection process.

3.5.7.2.2 forced operation

forced operation can be used to establish 10 mb/s and 100 mb/s links, and 1000 mb/s links for test purposes. in this method, auto-negotiation is disabled completely and the link state of the phy is determined by mii register 0.

note: when speed is forced, the auto cross-over feature is not functional.

in forced operation, the designer sets the link speed (10, 100, or 1000 mb/s) and duplex state (full or half). for gigabit (1000 mb/s) links, designers must explicitly designate one side as the master and the other as the slave.

note: the paradox (per the standard): if one side of the link is forced to full-duplex operation and the other side has auto-negotiation enabled, the auto-negotiating partner parallel-detects to a half-duplex link while the forced side operates as directed in full-duplex mode. the result is spurious, unexpected collisions on the side configured to auto-negotiate.

table 3-35 lists link establishment procedures.

3.5.7.2.3 auto negotiation

the phy supports the ieee 802.3u auto-negotiation scheme with next page capability. next page exchange uses register 7 to send information and register 8 to receive it. next page exchange can only occur if both ends of the link advertise their ability to exchange next pages.

3.5.7.2.4 parallel detection

parallel detection can only be used to establish 10 and 100 mb/s links. it occurs when the phy tries to negotiate (transmit flps to its link partner), but instead of sensing flps from the link partner, it senses 100base-tx mlt3 code or 10base-t normal link pulses (nlps).
in this case, the phy immediately stops auto-negotiation (terminates transmission of flps) and immediately brings up whatever link corresponds to what it has sensed (mlt3 or nlps). if the phy senses both technologies, the parallel detection fault is detected and the phy continues sending flps.

with parallel detection, it is impossible to determine the true duplex state of the link partner and the ieee standard requires the phy to assume a half-duplex link. parallel detection also does not allow exchange of flow-control ability (pause and asm_dir) or the master/slave relationship required by 1000base-t. this is why parallel detection cannot be used to establish gbe links.

table 3-35. determining duplex state via parallel detection

configuration                                                               result
both sides set for auto-negotiate                                           link is established via auto-negotiation.
both sides set for forced operation                                         no problem as long as duplex settings match.
one side set for auto-negotiation and the other for forced, half-duplex     link is established via parallel detect.
one side set for auto-negotiation and the other for forced, full-duplex     link is established; however, sides disagree, resulting in transmission problems (forced side is full-duplex, auto-negotiation side is half-duplex).
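The outcomes in table 3-35 can be expressed as a short function, which makes the duplex-mismatch case easy to see. This is an illustrative sketch: the function name and the string labels ("auto", "forced-half", "forced-full", "negotiated") are assumptions made for the example, not datasheet terms.

```python
def resolve_duplex(side_a: str, side_b: str):
    """Return (duplex_a, duplex_b, mismatch) for two link partner settings.

    Each side is 'auto', 'forced-half' or 'forced-full'.
    """
    valid = {"auto", "forced-half", "forced-full"}
    if side_a not in valid or side_b not in valid:
        raise ValueError("unknown configuration")
    if side_a == "auto" and side_b == "auto":
        # Duplex is whatever auto-negotiation resolves; no mismatch is possible.
        return ("negotiated", "negotiated", False)

    def duplex(side: str) -> str:
        # An auto-negotiating side facing a forced partner parallel-detects
        # and must assume half duplex; a forced side operates as directed.
        return "half" if side == "auto" else side.split("-")[1]

    a, b = duplex(side_a), duplex(side_b)
    return (a, b, a != b)
```

The last table row is the case `resolve_duplex("auto", "forced-full")`, which reports half duplex on one side, full duplex on the other, and a mismatch.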
3.5.7.2.5 auto cross-over

twisted pair ethernet phys must be correctly configured for mdi or mdi-x operation to interoperate. this has historically been accomplished using special patch cables, magnetics pinouts or printed circuit board (pcb) wiring. the phy supports the automatic mdi/mdi-x configuration originally developed for 1000base-t and standardized in ieee 802.3 clause 40. manual (non-automatic) configuration is still possible.

for 1000base-t links, pair identification is determined automatically in accordance with the standard. for 10/100 mb/s links and during auto-negotiation, pair usage is determined by bits 12 and 13 in the port control register (phyreg18). in addition, the phy has an automatic cross-over detection function. if bit 18.12 = 1b, the phy automatically detects which application is being used and configures itself accordingly.

the automatic mdi/mdi-x state machine facilitates switching the mdi_plus[0] and mdi_minus[0] signals with the mdi_plus[1] and mdi_minus[1] signals, respectively, prior to the auto-negotiation mode of operation so that flps can be transmitted and received in compliance with clause 28 auto-negotiation specifications. an algorithm that controls the switching function determines the correct polarization of the cross-over circuit. this algorithm uses an 11-bit linear feedback shift register (lfsr) to create a pseudo-random sequence that each end of the link uses to determine its proposed configuration. after making the selection to either mdi or mdi-x, the node waits for a specified amount of time while evaluating its receive channel to determine whether the other end of the link is sending link pulses or phy-dependent data. if link pulses or phy-dependent data are detected, it remains in that configuration.
if link pulses or phy-dependent data are not detected, it increments its lfsr and makes a decision to switch based on the value of the next bit. the state machine does not move from one state to another while link pulses are being transmitted.

3.5.7.2.6 10/100 mb/s mismatch resolution

it is a common occurrence that a link partner (such as a switch) is configured for forced full-duplex 10/100 mb/s operation. the normal auto-negotiation sequence would result in the other end settling for half-duplex 10/100 mb/s operation. the mechanism described in this section resolves the mismatch and automatically transitions the 82576 into fdx mode, enabling it to operate with a partner configured for fdx operation.

figure 3-7. cross-over function
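The lfsr-driven mdi/mdi-x selection of section 3.5.7.2.5 can be illustrated with a toy model. Everything specific in this sketch is an assumption made for illustration: the feedback taps (bits 10 and 8, a common maximal-length choice for an 11-bit register), the rule that a new bit of 1 means "switch", and the function names. The text above only states that an 11-bit lfsr supplies the pseudo-random decision bit.

```python
def lfsr11_step(state: int) -> int:
    """Advance an 11-bit Fibonacci lfsr (assumed taps: bits 10 and 8)."""
    new_bit = ((state >> 10) ^ (state >> 8)) & 1
    return ((state << 1) | new_bit) & 0x7FF


def next_crossover_config(state: int, config: str, link_activity: bool):
    """One evaluation interval of the cross-over decision.

    Returns the updated (state, config), where config is 'mdi' or 'mdi-x'.
    """
    if link_activity:
        # Link pulses or phy-dependent data detected: keep the configuration.
        return state, config
    state = lfsr11_step(state)
    if state & 1:  # assumed rule: the newly generated bit decides a switch
        config = "mdi-x" if config == "mdi" else "mdi"
    return state, config
```

Because the two link partners start from different lfsr states, their switching decisions decorrelate, so both ends do not keep flipping in lockstep.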
the 82576 enables the system software device driver to detect the mismatch event previously described and set its duplex mode to the appropriate value without a need to go through another auto-negotiation sequence or breaking link. once software detects a possible mismatch, it might instruct the 82576 to change its duplex setting to either hdx or fdx mode. software sets the duplex_manual_set bit to indicate that the duplex setting should be changed to the value indicated by the duplex mode bit in phy register 0. any change in the value of the duplex mode bit in phy register 0 while the duplex_manual_set bit is set to 1b would also cause a change in the device duplex setting. the duplex_manual_set bit is cleared on all phy resets, following auto-negotiation, and when the link goes down. software might track the change in duplex through the phy duplex mode bit in register 17 or a mac indication.

3.5.7.2.7 link criteria

once the link state is determined via auto-negotiation, parallel detection or forced operation, the phy and its link partner bring up the link.

3.5.7.2.7.1 1000base-t

for 1000base-t links, the phy and its link partner enter a training phase. they exchange idle symbols and use the information gained to set their adaptive filter coefficients. these coefficients are used to equalize the incoming signal, as well as eliminate signal impairments such as echo and cross talk.

either side indicates completion of the training phase to its link partner by changing the encoding of the idle symbols it transmits. when both sides so indicate, the link is up. each side continues sending idle symbols each time it has no data to transmit. the link is maintained as long as valid idle, data, or carrier extension symbols are received.

3.5.7.2.7.2 100base-tx

for 100base-tx links, the phy and its link partner immediately begin transmitting idle symbols.
each side continues sending idle symbols each time it has no data to transmit. the link is maintained as long as valid idle symbols or data are received.

in 100 mb/s mode, the phy establishes a link each time the scrambler becomes locked and remains locked for approximately 50 ms. link remains up unless the descrambler receives fewer than 12 consecutive idle symbols in any 2 ms period. this provides for a very robust operation, essentially filtering out any small noise hits that might otherwise disrupt the link.

3.5.7.2.7.3 10base-t

for 10base-t links, the phy and its link partner begin exchanging normal link pulses (nlps). the phy transmits an nlp every 16 ms and expects to receive one every 10 to 20 ms. the link is maintained as long as normal link pulses are received.

in 10 mb/s mode, the phy establishes link based on the link state machine found in 802.3, clause 14.

note: 100 mb/s idle patterns do not bring up a 10 mb/s link.

3.5.7.3 link enhancements

the phy offers two enhanced link functions, each of which is discussed in the sections that follow:
• smartspeed
• flow control

3.5.7.3.1 smartspeed

smartspeed is an enhancement to auto-negotiation that enables the phy to react intelligently to network conditions that prohibit establishment of a 1000base-t link, such as cable problems. such problems might allow auto-negotiation to complete, but then inhibit completion of the training phase. normally, if a 1000base-t link fails, the phy returns to the auto-negotiation state with the same speed settings indefinitely. with smartspeed enabled, after a configurable number (1-5, register 27.8:6) of failed attempts, the phy automatically downgrades the highest ability it advertises to the next lower speed: from 1000 to 100 to 10 mb/s. once a link is established, and if it is later broken, the phy automatically upgrades the capabilities advertised to the original setting. this enables the phy to automatically recover once the cable plant is repaired.

3.5.7.3.1.1 using smartspeed

smartspeed is enabled by setting phyreg.16.7 = 1b. when smartspeed downgrades the phy advertised capabilities, it sets bit phyreg.19.5. when link is established, its speed is indicated in phyreg.17.15:14. smartspeed automatically resets the highest-level auto-negotiation abilities advertised, if link is established and then lost for more than 2 seconds. the number of failed attempts allowed is configured by register 27.8:6.

note: smartspeed and m/s fault - when smartspeed is enabled, the m/s (master-slave) resolution is not given seven attempts to resolve m/s status (see ieee 802.3 clause 40.5.2). this is because smartspeed downgrades the link after at most five attempts.

time to link with smartspeed - in most cases, each attempt lasts approximately 2.5 seconds; in other cases it could take more than 2.5 seconds, depending on configuration and other factors.
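The smartspeed downgrade and recovery behaviour can be sketched as a small model. This is a simplification under stated assumptions: the class and method names are hypothetical, the failure counter is assumed to restart after each downgrade, and the caller reports how long the link has been down rather than the model measuring it.

```python
class SmartSpeed:
    """Toy model of the advertised-speed list managed by smartspeed."""

    def __init__(self, max_attempts: int, advertised=(1000, 100, 10)):
        if not 1 <= max_attempts <= 5:
            raise ValueError("failed-attempt limit is configurable from 1 to 5")
        self.max_attempts = max_attempts
        self.original = sorted(advertised, reverse=True)
        self.current = list(self.original)
        self.failures = 0
        self.downgraded = False  # mirrors the indication in phyreg.19.5

    def attempt_failed(self) -> None:
        """Record one failed link attempt; drop the top speed at the limit."""
        self.failures += 1
        if self.failures >= self.max_attempts and len(self.current) > 1:
            self.current.pop(0)   # stop advertising the highest speed
            self.failures = 0     # assumption: counter restarts per speed
            self.downgraded = True

    def link_lost(self, seconds_down: float) -> None:
        """Restore the original advertisement after more than 2 s without link."""
        if seconds_down > 2.0:
            self.current = list(self.original)
            self.failures = 0
            self.downgraded = False
```

With a limit of 2, two failed attempts at 1000 mb/s leave 100 and 10 mb/s advertised, and a later link loss of more than 2 seconds restores the full list.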
3.5.7.4 flow control

flow control is a function that is described in clause 31 of the ieee 802.3 standard. it allows congested nodes to pause traffic. flow control is essentially a mac-to-mac function. macs indicate their ability to implement flow control during auto-negotiation. this ability is communicated through two bits in the auto-negotiation registers (phyreg.4.10 and phyreg.4.11).

the phy transparently supports mac-to-mac advertisement of flow control through its auto-negotiation process. prior to auto-negotiation, the mac indicates its flow control capabilities via phyreg.4.10 (pause) and phyreg.4.11 (asm_dir). after auto-negotiation, the link partner's flow control capabilities are indicated in phyreg.5.10 and phyreg.5.11.

there are two forms of flow control that can be established via auto-negotiation: symmetric and asymmetric. symmetric flow control is for point-to-point links; asymmetric for hub-to-end-node connections. symmetric flow control enables either node to flow-control the other. asymmetric flow control enables a repeater or switch to flow-control a dte, but not vice versa.

table 3-36 lists the intended operation for the various settings of asm_dir and pause. this information is provided for reference only; it is the responsibility of the mac to implement the correct function. the phy merely enables the two macs to communicate their abilities to each other.
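The resolution rules of table 3-36 are compact enough to state as code. The sketch below takes the raw values of phy registers 4 (local advertisement) and 5 (link partner ability) and uses the bit positions given in the text above (bit 10 for pause, bit 11 for asm_dir). The function name and the result strings are illustrative assumptions.

```python
PAUSE = 1 << 10     # phyreg.4.10 / phyreg.5.10
ASM_DIR = 1 << 11   # phyreg.4.11 / phyreg.5.11


def resolve_flow_control(local_reg4: int, partner_reg5: int) -> str:
    """Return the flow control mode implied by table 3-36."""
    local_pause = bool(local_reg4 & PAUSE)
    remote_pause = bool(partner_reg5 & PAUSE)
    both_asm = bool(local_reg4 & ASM_DIR) and bool(partner_reg5 & ASM_DIR)

    if local_pause and remote_pause:
        return "symmetric"
    if both_asm and local_pause:
        return "asymmetric: remote can flow control local"
    if both_asm and remote_pause:
        return "asymmetric: local can flow control remote"
    return "none"
```

As the text notes, acting on the result is the responsibility of the mac; the phy only carries the advertised bits.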
3.5.7.5 management data interface

the phy supports the ieee 802.3 mii management interface, also known as the management data input/output (mdio) interface. this interface enables upper-layer devices to monitor and control the state of the phy. the mdio interface consists of a physical connection, a specific protocol that runs across the connection, and an internal set of addressable registers.

the phy supports the core 16-bit mdio registers. registers 0-10 and 15 are required and their functions are specified by the ieee 802.3 specification. additional registers are included for expanded functionality. specific bits in the registers are referenced using a phy reg x.y notation, where x is the register number (0-31) and y is the bit number (0-15). see the software interface chapter.

3.5.7.6 low power operation and power management

the phy incorporates numerous features to maintain the lowest power possible. the phy can be placed into a low-power state according to mac control (power management controls) or via phy register 0. in either power down mode, the phy is not capable of receiving or transmitting packets.

3.5.7.6.1 power down via the phy register

the phy can be powered down using the control bit found in phyreg.0.11. this bit powers down a significant portion of the port but clocks to the register section remain active. this enables the phy management interface to remain active during register power down. the power down bit is active high. when the phy exits software power-down (phyreg.0.11 = 0b), it re-initializes all analog functions, but retains its previous configuration settings.

3.5.7.6.2 power management state

the phy is aware of the power management state. if the phy is not in a power down state, then phy behavior regarding several features differs depending on the power state. see section 3.5.7.6.4 for details.
table 3-36. pause and asymmetric pause settings

asm_dir: local (phyreg.4.11) and remote (phyreg.5.11)
pause: local (phyreg.4.10) and remote (phyreg.5.10)

asm_dir settings               pause - local   pause - remote   result
both asm_dir = 1b              1               1                symmetric - either side can flow control the other
both asm_dir = 1b              1               0                asymmetric - remote can flow control local only
both asm_dir = 1b              0               1                asymmetric - local can flow control remote
both asm_dir = 1b              0               0                no flow control
either or both asm_dir = 0b    1               1                symmetric - either side can flow control the other
either or both asm_dir = 0b    either or both = 0               no flow control

3.5.7.6.3 an1000_dis
an1000_dis is an option to disable 1000 mb/s advertisement in the phy regardless of register 9. this is for cases where the system doesn't support working in 1000 mb/s due to power limitations. this option is enabled by the following bits in phy registers:
• phyreg 25.3 - disable 1000 mb/s when in non-d0a states only.
• phyreg 25.6 - disable 1000 mb/s always.
• phyreg 26.0 - same as 25.6, but this is a secure bit (see secure register chapter).

3.5.7.6.4 low power link up - link speed control

normal phy speed negotiation drives to establish a link at the highest possible speed. the phy supports an additional mode of operation, where the phy drives to establish a link at a low speed. the link-up process enables a link to come up at the lowest possible speed in cases where power is more important than performance. different behavior is defined for the d0 state and the other non-d0 states.

note: the low-power link-up (lplu) feature previously described should be disabled (in both d0a state and non-d0a states) when the designer advertisement is anything other than 10/100/1000 mb/s (all three). this is to avoid reaching (through the lplu procedure) a link speed that is not advertised by the user.

table 3-37 lists link speed as a function of power management state, link speed control, and gbe speed enabling:

table 3-37. link speed vs. power state

power mgmt   low power link up    disable 1000   disable 1000 in       phy speed negotiation
state        (reg 25.1 and 2)     (reg 25.6)     non-d0a (reg 25.3)
d0a          0, xb                0b             x                     phy negotiates to highest speed advertised (normal operation).
d0a          0, xb                1b             x                     phy negotiates to highest speed advertised (normal operation), excluding 1000 mb/s.
d0a          1, xb                0b             x                     phy goes through low power link up (lplu) procedure, starting with advertised values.
d0a          1, xb                1b             x                     phy goes through lplu procedure, starting with advertised values. does not advertise 1000 mb/s.
non-d0a      x, 0b                0b             0b                    phy negotiates to highest speed advertised.
non-d0a      x, 0b                0b             1b                    phy negotiates to highest speed advertised, excluding 1000 mb/s.
non-d0a      x, 0b                1b             x                     phy negotiates to highest speed advertised, excluding 1000 mb/s.
non-d0a      x, 1b                0b             0b                    phy goes through lplu procedure, starting at 10 mb/s.
non-d0a      x, 1b                0b             1b                    phy goes through lplu procedure, starting at 10 mb/s. does not advertise 1000 mb/s.

the phy initiates auto-negotiation without a direct driver command in the following cases:
• when the state of disable_1000 changes. for example, if 1000 mb/s is disabled on d3 or dr entry (but not in d0a), the phy auto-negotiates on entry.
interconnects ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 149 ? when lplu changes state with a change in a power management state. for example, on transition from d0a without lplu to d3 with lplu. or, on transition from d3 with lplu to d0 without lplu. ? on a transition from d0a state to a non-d0a state, or from a non-d0a state to d0a state, and lplu is set. 3.5.7.6.4.1 d0a state a power-managed link speed control lowers link speed (and power) when highest link performance is not required. when enabled (d0 low power link up mode), any link negotiation tries to establish a low- link speed, starting with an initial advertisement defined by software. the d0lplu configuration bit enables d0 low power link up . before enabling this feature, software must advertise to one of the following speed combinations: 10 mb/s only, 10/100 mb/s only, or 10/100/ 1000 mb/s. when speed negotiation starts, the phy tries to negotiate at a speed based on the currently advertised values. if link establishment fails, the phy tries to negotiate with different speeds; it enables all speeds up to the lowest speed supported by the partner. for example, phy advertises 10 mb/s only, and the partner supports 1000 mb/s only. after the first try fails, the phy enables 10/100/1000 mb/s and tries again. the phy continues to try and establish a link until it succeeds or until it is instructed otherwise. in the second step (adjusting to partner speed), the phy also enables parallel detect, if needed. automatic mdi/mdi-x resolution is done during the first auto-negotiation stage. 3.5.7.6.4.2 non-d0a state the phy might negotiate to a low speed while in non-d0a states (dr, d0u, d3). this applies only when the link is required by one of the following: smbus manageability, apm wake, or pme. otherwise, the phy is disabled during the non-d0 state. 
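table 3-37 can also be read as a small decision function. the python sketch below is only an illustration of that reading, not driver code: the function name, the argument names, and the `Negotiation` tuple are invented here. the table as extracted has no row for non-d0a with lplu set and reg 25.6 = 1b, so the sketch assumes that 25.6 removes 1000 mb/s from the advertisement in that case as it does in every other row.

```python
from typing import NamedTuple


class Negotiation(NamedTuple):
    lplu: bool            # True: the low power link up procedure is used
    start: str            # "advertised" or "10 mb/s"
    advertise_1000: bool  # False: 1000 mb/s is excluded from advertisement


def phy_speed_negotiation(d0a: bool, d0_lplu: bool, non_d0a_lplu: bool,
                          disable_1000: bool,
                          disable_1000_non_d0a: bool) -> Negotiation:
    """Hypothetical helper mirroring table 3-37.

    d0_lplu / non_d0a_lplu -> phyreg 25.1 / 25.2
    disable_1000           -> phyreg 25.6
    disable_1000_non_d0a   -> phyreg 25.3
    """
    if d0a:
        # In d0a only 25.1 and 25.6 matter; 25.2 and 25.3 are "x" in the table.
        return Negotiation(d0_lplu, "advertised", not disable_1000)
    # In non-d0a states either disable bit removes 1000 mb/s (assumption for
    # the lplu=1, 25.6=1 combination, which the extracted table does not list).
    gbe_off = disable_1000 or disable_1000_non_d0a
    start = "10 mb/s" if non_d0a_lplu else "advertised"
    return Negotiation(non_d0a_lplu, start, not gbe_off)
```

for example, `phy_speed_negotiation(False, False, True, False, True)` corresponds to the last row of the table: lplu starting at 10 mb/s without advertising 1000 mb/s.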
the low power on link-up bit (register 25.2, also loaded from the eeprom) enables a reduction in link speed:
• at power-up entry to dr state, the phy advertises support for 10 mb/s only and goes through the link-up process.
• at any entry to a non-d0a state (dr, d0u, d3), the phy advertises support for 10 mb/s only and goes through the link-up process.
• while in a non-d0 state, if auto-negotiation is required, the phy advertises support for 10 mb/s only and goes through the link-up process.

link negotiation begins with the phy trying to negotiate at 10 mb/s only, regardless of the user auto-negotiation advertisement. if link establishment fails, the phy tries to negotiate at additional speeds; it enables all speeds up to the lowest speed supported by the partner. for example, the phy advertises 10 mb/s only and the partner supports 1000 mb/s only. after the first try fails, the phy enables 10/100/1000 mb/s and tries again. the phy continues trying to establish a link until it succeeds or until it is instructed otherwise. in the second step (adjusting to partner speed), the phy also enables parallel detect, if needed. automatic mdi/mdi-x resolution is done during the first auto-negotiation stage.

3.5.7.6.5 smart power-down (spd)

smart power-down is a link-disconnect capability applicable to all power management states. spd combines a power saving mechanism with the fact that the link might disappear and resume.
smart power-down is enabled by phyreg 25.0 or by the spd enable bit in the eeprom and is entered when the phy detects link loss. auto-negotiation must also be enabled.

while in the smart power-down state, the phy powers down circuits and clocks that are not required for detection of link activity. the phy is still able to detect link pulses (including parallel detect) and wake up to engage in link negotiation. the phy does not send link pulses (nlp) while in spd state; however, register accesses are still possible.

when the phy is in smart power-down and detects link activity, it re-negotiates link speed based on the power state and the low power link up bits as described in phyreg 25.1 and 25.2.

note: the link-disconnect state applies to all power management states (dr, d0u, d0a, d3). the link might change status, that is, go up or go down, while in any of these states.

3.5.7.6.5.1 back-to-back smart power-down

while in link disconnect, the 82576 monitors the link for link pulses to identify when a link is re-connected. the 82576 also periodically transmits pulses to resolve the case of two 82576 devices (or devices with 82576-like behavior) connected to each other across the link. otherwise, two such devices might be locked in smart power-down mode, not capable of identifying that a link was re-connected.

the link pulses are transmitted on average every 100 ms on alternate channels (a/b and c/d) and add <1% to the total 82576 power in link disconnect mode. the pulses do not conform to the ieee specification regarding the link pulse template. a single pulse should be enough to bring a receiver out of smart power-down mode in a worst-case configuration (such as maximum cable length, highest cable attenuation, etc.).

if the link partners are disconnected and then reconnected, it is possible that the two controllers transmit their pulses at the same time.
since the 82576 masks its receiver during pulse transmission, such synchronization causes pulses to be missed by both partners. a randomization factor is therefore applied to the timing of transmitted pulses, affecting the period between pulses. the randomization factor is specific per device and should reduce the probability of a lock to 10^-4. note that if the two partners happen to transmit within the same slot, and if the randomization factor happens to be similar, it takes longer for the partners to get out of sync with each other.

back-to-back smart power-down is enabled by the spd_b2b_en bit in the phy registers. the default value is enabled. the enable bit applies to smart power-down mode.

note: this bit should not be altered by software once the 82576 has been set in smart power-down mode. if software needs to change the back-to-back status, it must first transition the phy out of smart power-down mode and only then change the back-to-back bit to the required state.

3.5.7.6.6 link energy detect

the phy asserts the link energy detect bit (phyreg 25.4) each time energy is detected on the link. this bit provides an indication of a cable becoming plugged or unplugged.

this bit is valid only if auto-negotiation is enabled and smart power-down is enabled (reg 25.0).

in order to correctly deduce that there is no energy, the bit must read 0b for three consecutive reads each second.

3.5.7.6.7 phy power-down state
each 82576 port enters a power-down state when none of its clients is enabled and it therefore has no need to maintain a link. this can happen in one of the following cases. note that phy power-down must be enabled through the eeprom phy power down enable bit.

1. d3/dr state: each phy enters a low-power state if the following conditions are met:
   a. the lan function associated with this phy is in a non-d0 state.
   b. apm wol is inactive.
   c. manageability doesn't use this port.
   d. acpi pme is disabled for this port.
   e. the phy power down enable eeprom bit is set (word 0xf, bit 6).
2. serdes mode: each phy is disabled when its lan function is configured to serdes mode.
3. lan disable: each phy can be disabled if its lan function's lan disable input indicates that the relevant function should be disabled. since the phy is shared between the lan function and manageability, it might not be desirable to power down the phy in lan disable. the phy_in_lan_disable eeprom bit determines whether the phy (and mac) are powered down when the lan disable pin is asserted. the default is not to power down.

a lan port can also be disabled through eeprom settings. if the lan_dis eeprom bit is set, the phy enters power down. note, however, that setting the eeprom lan_pci_dis bit does not bring the phy into power down.

3.5.7.7 advanced diagnostics

the 82576 phy incorporates hardware support for advanced diagnostics. the hardware support enables output of internal phy data to host memory for post processing by the software device driver. the supported diagnostics are:

3.5.7.7.1 tdr - time domain reflectometry

by sending a pulse onto the twisted pair and observing the returned signal, the following can be deduced:
1. is there a short?
2. is there an open?
3. is there an impedance mismatch?
4. what is the length to any of these faults?
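the four tdr questions above follow from standard transmission-line relations, sketched below in python. this is a textbook illustration and not the 82576 post-processing algorithm, which this section does not describe. the 100 ohm characteristic impedance, the 0.65 nominal velocity of propagation, and the 0.1 classification tolerance are assumed values chosen for the example; all function names are hypothetical.

```python
C_M_PER_NS = 0.299792458  # speed of light in metres per nanosecond


def reflection_coefficient(z_load: float, z0: float = 100.0) -> float:
    """Gamma = (ZL - Z0) / (ZL + Z0); pass float('inf') for an open circuit."""
    if z_load == float("inf"):
        return 1.0
    return (z_load - z0) / (z_load + z0)


def classify_fault(gamma: float, tol: float = 0.1) -> str:
    """Map a reflection coefficient to one of the cases listed in the text.

    The tolerance is an assumption made for this sketch.
    """
    if gamma >= 1.0 - tol:
        return "open"                 # reflection returns with the same sign
    if gamma <= -1.0 + tol:
        return "short"                # reflection returns inverted
    if abs(gamma) <= tol:
        return "matched"              # no significant reflection
    return "impedance mismatch"       # partial reflection


def distance_to_fault_m(round_trip_ns: float, nvp: float = 0.65) -> float:
    """Distance = propagation speed * round-trip time / 2.

    nvp is the cable's velocity of propagation as a fraction of c (assumed).
    """
    return round_trip_ns * C_M_PER_NS * nvp / 2.0
```

with these assumptions, a reflection that returns after 100 ns places the fault a little under 10 m from the transmitter.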
3.5.7.7.2 channel frequency response

by analyzing the tx and rx data, the channel's frequency response (also known as insertion loss) can be established and used to determine whether the channel is within specification limits (clause 40.7.2.1 in ieee 802.3).

3.5.7.8 1000 mb/s operation

3.5.7.8.1 introduction
figure 3-8 shows an overview of 1000base-t functions, followed by discussion and review of the internal functional blocks.

figure 3-8. 1000base-t functions overview

3.5.7.8.2 transmit functions

this section describes functions used when the media access controller (mac) transmits data through the phy and out onto the twisted-pair connection (see figure 3-8).

3.5.7.8.2.1 scrambler

the scrambler randomizes the transmitted data. the purpose of scrambling is twofold:
1. scrambling eliminates repeating data patterns (also known as spectral lines) from the 4dpam5 waveform in order to reduce emi.
2. each channel (a, b, c, d) has a unique signature that the receiver uses for identification.
the scrambler is driven by a 33-bit linear feedback shift register (lfsr), which is randomly loaded at power up. the lfsr function used by the master differs from that used by the slave, giving each direction its own unique signature. the lfsr, in turn, generates twelve mutually uncorrelated outputs. eight of these are used to randomize the inputs to the 4dpam5 and trellis encoders. the remaining four outputs randomize the sign of the 4dpam5 outputs.

3.5.7.8.2.2 transmit fifo

the transmit fifo re-synchronizes data transmitted by the mac to the transmit reference used by the phy. the fifo is large enough to support a frequency differential of up to +/- 1000 ppm over a packet size of 9500 bytes (max jumbo frame).

3.5.7.8.2.3 transmit phase-locked loop (pll)

this function generates the 125 mhz timing reference used by the phy to transmit 4dpam5 symbols. when the phy is the master side of the link, the xi input is the reference for the transmit pll. when the phy is the slave side of the link, the recovered receive clock is the reference for the transmit pll.

3.5.7.8.2.4 trellis encoder

the trellis encoder uses the two high-order bits of data and its previous output to generate a ninth bit, which determines if the next 4dpam5 pattern should be even or odd. for data, this function is:

trellis(n) = data7(n-1) xor data6(n-2) xor trellis(n-3)

this provides forward error correction and improves the signal-to-noise ratio (snr) by 6 db.

3.5.7.8.2.5 4dpam5 encoder

the 4dpam5 encoder translates 8-bit codes transmitted by the mac into 4dpam5 symbols. the encoder operates at 125 mhz, which is both the frequency of the mac interface and the baud rate used by 1000base-t. each 8-bit code represents one of 2^8, or 256, data patterns.
each 4dpam5 symbol consists of one of five signal levels (-2, -1, 0, 1, 2) on each of the four twisted pairs (a, b, c, d), representing 5^4, or 625, possible patterns per baud period. of these, 113 patterns are reserved for control codes, leaving 512 patterns for data. these data patterns are divided into two groups of 256 even and 256 odd data patterns. thus, each 8-bit octet has two possible 4dpam5 representations: one even and one odd pattern.

3.5.7.8.2.6 spectral shaper

this function causes the 4dpam5 waveform to have a spectral signature that is very close to that of the mlt3 waveform used by 100base-tx. this enables 1000base-t to take advantage of infrastructure (cables, magnetics) designed for 100base-tx. the shaper works by transmitting 75% of a 4dpam5 code in the current baud period, and adding the remaining 25% into the next baud period.
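the 75% / 25% split described for the spectral shaper is a two-tap filter, y[n] = 0.75 * x[n] + 0.25 * x[n-1]. the python sketch below applies it to a sequence of levels on one pair. it is a numerical illustration only: the function name is made up, and the level before the first symbol is assumed to be 0.

```python
def spectral_shaper(symbols):
    """Apply y[n] = 0.75 * x[n] + 0.25 * x[n-1] to one pair's symbol levels.

    The level preceding the first symbol is assumed to be 0.
    """
    shaped = []
    previous = 0
    for level in symbols:
        # 75% of the current code plus the 25% carried over from the last one
        shaped.append(0.75 * level + 0.25 * previous)
        previous = level
    return shaped
```

for the input levels `[2, -2, 0, 1]` the shaped output is `[1.5, -1.0, -0.5, 0.75]`: each transition is smoothed because a quarter of every symbol spills into the following baud period.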
3.5.7.8.2.7 low-pass filter

to aid with emi, this filter attenuates signal components above 180 mhz. in 1000base-t, the fundamental symbol rate is 125 mhz.

3.5.7.8.2.8 line driver

the line driver drives the 4dpam5 waveforms onto the four twisted-pair channels (a, b, c, d), adding them onto the waveforms that are simultaneously being received from the link partner.

figure 3-9. 1000base-t transmit flow and line coding scheme

3.5.7.8.3 receive functions

this section describes function blocks that are used when the phy receives data from the twisted pair interface and passes it back to the mac (see figure 3-10).

figure 3-10. transmit/receive flow
3.5.7.8.3.1 hybrid

the hybrid subtracts the transmitted signal from the input signal, enabling the use of simple 100base-tx compatible magnetics.

3.5.7.8.3.2 automatic gain control (agc)

agc normalizes the amplitude of the received signal, adjusting for the attenuation produced by the cable.

3.5.7.8.3.3 timing recovery

this function re-generates a receive clock from the incoming data stream, which is used to sample the data. on the slave side of the link, this clock is also used to drive the transmitter.

3.5.7.8.3.4 analog-to-digital converter (adc)

the adc function converts the incoming data stream from an analog waveform to digitized samples for processing by the dsp core.

3.5.7.8.3.5 digital signal processor (dsp)

the dsp provides per-channel adaptive filtering, which eliminates various signal impairments including:
• inter-symbol interference (equalization)
• echo caused by impedance mismatch of the cable
• near-end crosstalk (next) between adjacent channels (a, b, c, d)
• far-end crosstalk (fext)
• propagation delay variations between channels of up to 120 ns
• extraneous tones that have been coupled into the receive path

the adaptive filter coefficients are initially set during the training phase. they are continuously adjusted (adaptive equalization) during operation through the decision-feedback loop.

3.5.7.8.3.6 descrambler

the descrambler identifies each channel by its characteristic signature, removing the signature and re-routing the channel internally. in this way, the receiver can correct for channel swaps and polarity reversals. the descrambler uses the same base 33-bit lfsr used by the transmitter on the other side of the link.

the descrambler automatically loads the seed value from the incoming stream of scrambled idle symbols. the descrambler requires approximately 15 µs to lock, normally accomplished during the training phase.

3.5.7.8.3.7 viterbi decoder/decision feedback equalizer (dfe)

the viterbi decoder generates clean 4dpam5 symbols from the output of the dsp. the decoder includes a trellis encoder identical to the one used by the transmitter. the viterbi decoder simultaneously looks at the received data over several baud periods. for each baud period, it predicts whether the symbol received should be even or odd, and compares that to the actual symbol received. the 4dpam5 code is organized in such a way that a single level error on any channel changes an even code to an odd one and vice versa. in this way, the viterbi decoder can detect single-level coding errors, effectively improving the signal-to-noise ratio (snr) by 6 db. when an error occurs, this information is quickly fed back into the equalizer to prevent future errors.

3.5.7.8.3.8 4dpam5 decoder

the 4dpam5 decoder generates 8-bit data from the output of the viterbi decoder.

3.5.7.8.3.9 100 mb/s operation

the mac passes data to the phy over the mii. the phy encodes and scrambles the data, then transmits it using mlt-3 for 100tx over copper. the phy descrambles and decodes mlt-3 data received from the network. when the mac is not actively transmitting data, the phy sends out idle symbols on the line.

3.5.7.8.3.10 10 mb/s operation

the phy operates as a standard 10 mb/s transceiver. data transmitted by the mac as 4-bit nibbles is serialized, manchester-encoded, and transmitted on the mdi[0]+/- outputs. received data is decoded, de-serialized into 4-bit nibbles and passed to the mac across the internal mii. the phy supports all the standard 10 mb/s functions.

3.5.7.8.3.11 link test

in 10 mb/s mode, the phy always transmits link pulses. if the link test function is enabled, it monitors the connection for link pulses. once it detects two to seven link pulses, data transmission is enabled and remains enabled as long as the link pulses or data reception continue. if the link pulses stop, data transmission is disabled.

if the link test function is disabled, the phy might transmit packets regardless of detected link pulses. setting the port configuration register bit (phyreg.16.14) disables the link test function.

3.5.7.8.3.12 10base-t link failure criteria and override

link failure occurs if link test is enabled and link pulses stop being received.
if this condition occurs, the phy returns to the auto-negotiation phase, if auto-negotiation is enabled. setting the port configuration register bit (phyreg.16.14) disables the link integrity test function; the phy then transmits packets regardless of link status.

3.5.7.8.3.13 jabber

if the mac begins a transmission that exceeds the jabber timer, the phy disables the transmit and loopback functions and asserts collision indication to the mac. the phy automatically exits jabber mode after 250-750 ms. this function can be disabled by setting bit phyreg.16.10 = 1b.

3.5.7.8.3.14 polarity correction

the phy automatically detects and corrects for the condition where the receive signal (mdi_plus[0]/mdi_minus[0]) is inverted. reversed polarity is detected if eight inverted link pulses or four inverted end-of-frame markers are received consecutively. if link pulses or data are not received for 96-130 ms, the polarity state is reset to a non-inverted state.
automatic polarity correction can be disabled by setting bit phyreg.27.5.

3.5.7.8.3.15 dribble bits

the phy handles dribble bits for all of its modes. if between one and four dribble bits are received, the nibble is passed across the interface. the data passed across is padded with 1's if necessary. if between five and seven dribble bits are received, the second nibble is not sent onto the internal mii bus to the mac. this ensures that between one and seven dribble bits do not cause the mac to discard the frame due to a crc error.

3.5.7.8.3.16 phy address

the phy address for mdio accesses is 00001b.

3.5.8 media auto sense

the 82576 provides a significant amount of flexibility in pairing a lan device with a particular type of media (such as copper or fiber-optic) as well as the specific transceiver/interface used to communicate with the media. each mac, representing a distinct lan device, can be coupled with an internal copper phy (the default) or a serdes/sgmii interface independently. the link configuration for each lan device is specified in the link_mode field of the extended device control (ctrl_ext) register and initialized from the eeprom initialization control word 3 associated with each lan device.

in some applications, software might need to be aware of the presence of a link on the media not currently active. in order to supply such an indication, any of the 82576 ports can set the autosense_en bit in the connsw register (address 0x00034) to enable sensing of activity on the non-active media.

note: when in serdes/sgmii detect mode, software should define which indication is used to detect the energy change on the serdes/sgmii media. it can be either the external signal detect pin or the internal signal detect. this is done using the connsw.enrgsrc bit.
the signal detect pin is normally used when connecting in serdes mode to optical media, where the receive led provides such an indication.

software can then enable the omed interrupt in icr in order to get an indication on any detection of energy in the non-active media.

note: the auto-sense capability can be used in either port independent of the usage of the other port.

the following sections describe the procedures that should be followed in order to enable the auto-sense mode.

3.5.8.1 auto sense setup

3.5.8.1.1 serdes/sgmii detect mode (phy is active)

1. set connsw.enrgsrc to determine the source for the signal detect indication (1b = external sig_det, 0b = internal serdes electrical idle). the default of this bit is set by the eeprom.
2. set connsw.autosense_en.
3. when link is detected on the serdes/sgmii media, the 82576 sets the interrupt bit omed in icr and, if enabled, issues an interrupt. the connsw.autosense_en bit is cleared.
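steps 1 and 2 of the serdes/sgmii detect setup can be sketched as two read-modify-write operations on connsw. the python below runs against a dictionary-backed stand-in for the register file, so it exercises only the sequencing. the connsw address (0x00034) is taken from the text; the bit positions of autosense_en and enrgsrc are not given in this section and are placeholders here, as are the `FakeRegs` class and the function name.

```python
CONNSW = 0x00034          # register address given in the text

# Bit positions below are placeholders for illustration; this section
# does not state them.
AUTOSENSE_EN = 1 << 0
ENRGSRC = 1 << 2


class FakeRegs:
    """Dictionary-backed stand-in for memory-mapped 32-bit registers."""

    def __init__(self):
        self.mem = {}

    def read(self, addr):
        return self.mem.get(addr, 0)

    def write(self, addr, value):
        self.mem[addr] = value & 0xFFFFFFFF


def enable_serdes_detect(regs, use_external_sig_det):
    """Hypothetical helper following steps 1 and 2 of section 3.5.8.1.1."""
    # Step 1: select the energy source (1b = external sig_det,
    # 0b = internal serdes electrical idle).
    value = regs.read(CONNSW)
    if use_external_sig_det:
        value |= ENRGSRC
    else:
        value &= ~ENRGSRC
    regs.write(CONNSW, value)
    # Step 2: arm auto-sense. Step 3 (the omed interrupt and the hardware
    # clearing autosense_en) happens in the device and is not modelled here.
    regs.write(CONNSW, regs.read(CONNSW) | AUTOSENSE_EN)
```

a real driver would follow this by enabling the omed interrupt and waiting for it before deciding whether to switch media.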
3.5.8.1.2 phy detect mode (serdes/sgmii is active)

1. set connsw.autosense_conf = 1b.
2. reset the phy as described in section 4.2.
3. place the phy into link-disconnect mode by setting phy_reg 25.5 using the mdic register.
4. set connsw.autosense_en = 1b and then clear connsw.autosense_conf.
5. when signal is detected on the phy media, the 82576 sets the interrupt bit omed in icr and, if enabled, issues an interrupt.
6. the 82576 puts the phy in power down mode.

according to the result of the interrupt, software can then decide to switch to the other media.

3.5.8.2 switching between media

the 82576's link mode is controlled by the extended device control register, ctrl_ext (0x00018), bits 23:22. the default value for the link_mode setting is directly mapped from the eeprom's initialization control word 3 (bits 1:0). software can modify the link_mode indication by writing the corresponding value into this register.

note: before dynamically switching between media, software should ensure that the current mode of operation is not in the process of transmitting or receiving data. this is achieved by disabling the transmitter and receiver, waiting until the 82576 is in an idle state, and then beginning the process for changing the link mode.

a mode switch made this way is only valid until the next hardware reset of the 82576. after a hardware reset, the link mode is restored to the default setting by the eeprom. to make a permanent change of the link mode, the default in the eeprom should be changed.

the following procedures need to be followed to actually switch between the two modes.

3.5.8.2.1 transition to serdes/sgmii mode

1. disable the receiver by clearing rctl.rxen.
2. disable the transmitter by clearing tctl.en.
3. ensure smart power down is not enabled in the phy (eeprom word 0xf bit 1 or phy register 25d bit 0).
4. verify the 82576 has stopped processing outstanding cycles and is idle.
5. set ctrl.speed = 10b, ctrl.frcspd = 1b, ctrl_ext.spd_byps = 1b.
6. modify link mode to serdes or sgmii by setting ctrl_ext.link_mode to 11b or 10b, respectively.
7. delay for a minimum of 10-20 µs.
8. clear ctrl.frcspd and ctrl_ext.spd_byps.
9. set up the link as described in section 4.5.7.3 or section 4.5.7.4.
10. set up tx and rx queues and enable tx and rx processes.
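step 6 of the procedure above is a field update of ctrl_ext bits 23:22. a minimal python sketch of that update follows, using only the address, field position, and encodings stated in this section (00b phy, 10b sgmii, 11b serdes); the helper names are made up for the example, and the surrounding steps (disabling rx/tx, forcing speed, the delay) are left out because their bit positions are not given here.

```python
CTRL_EXT = 0x00018                 # address given in the text
LINK_MODE_SHIFT = 22               # link_mode occupies bits 23:22
LINK_MODE_MASK = 0b11 << LINK_MODE_SHIFT

LINK_MODE_PHY = 0b00
LINK_MODE_SGMII = 0b10
LINK_MODE_SERDES = 0b11


def set_link_mode(ctrl_ext_value: int, mode: int) -> int:
    """Return the ctrl_ext value with link_mode replaced by `mode`."""
    if mode not in (LINK_MODE_PHY, LINK_MODE_SGMII, LINK_MODE_SERDES):
        raise ValueError("link_mode encoding not listed in this section")
    cleared = ctrl_ext_value & ~LINK_MODE_MASK & 0xFFFFFFFF
    return cleared | (mode << LINK_MODE_SHIFT)


def get_link_mode(ctrl_ext_value: int) -> int:
    """Extract the 2-bit link_mode field from a ctrl_ext value."""
    return (ctrl_ext_value & LINK_MODE_MASK) >> LINK_MODE_SHIFT
```

keeping the update as a pure function on the register value makes the read-modify-write explicit: a driver would read ctrl_ext, pass the value through `set_link_mode`, and write the result back, leaving every other bit untouched.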
3.5.8.2.2 transition to internal phy mode

1. disable the receiver by clearing rctl.rxen.
2. disable the transmitter by clearing tctl.en.
3. verify the 82576 has stopped processing outstanding cycles and is idle.
4. modify link mode to phy mode by setting ctrl_ext.link_mode to 00b.
5. set link-up indication by setting ctrl.slu.
6. reset the phy as described in section 4.2.
7. set up the link as described in section 4.5.7.4.
8. set up the tx and rx queues and enable the tx and rx processes.
4.0 initialization

4.1 power up

4.1.1 power-up sequence

figure 4-1 shows the 82576 power-up sequence, from power ramp-up until the device is ready to accept host commands.

figure 4-1. 82576 power-up - general flow
note: the keep_phy_link_up bit (veto bit) can be set by firmware when the mc is running ider or sol. its purpose is to prevent interruption of these processes when power is being turned on.

4.1.2 power-up timing diagram

figure 4-2. power-up timing diagram

table 4-1. notes to power-up timing diagram

note | description
1 | xosc is stable t_xog after the power is stable.
2 | internal reset is released after all power supplies are good and t_ppg after xosc is stable.
3 | an nvm read starts on the rising edge of the internal reset.
4 | after reading the nvm, phy might exit power down mode.
5 | apm wakeup and/or manageability might be enabled based on nvm contents.
6 | the pcie reference clock is valid t_pe_rst-clk before the de-assertion of pe_rst# (according to pcie spec).
7 | pe_rst# is de-asserted t_pvpgl after power is stable (according to pcie spec).
8 | de-assertion of pe_rst# causes the nvm to be re-read, asserts phy power-down (except if the veto bit, also known as the keep_phy_link_up bit, is set), and disables wake up.
9 | after reading the nvm, phy exits power-down mode.
10 | link training starts after t_pgtrn from pe_rst# de-assertion.
11 | a first pcie configuration access might arrive after t_pgcfg from pe_rst# de-assertion.
12 | a first pci configuration response can be sent after t_pgres from pe_rst# de-assertion.
13 | writing a 1 to the memory access enable bit in the pci command register transitions the device from d0u to d0 state.

4.1.2.1 timing requirements

the 82576 requires the following start-up and power state transitions.

table 4-2. power-up timing requirements

parameter | description | min. / max. | notes
t_xog | base 25 clock stable from power stable | 10 msec |
t_pwrgd-clk | pcie clock valid to pcie power good | 100 µs / - | according to pcie spec
t_pvpgl | power rails stable to pcie reset inactive | 100 ms / - | according to pcie spec
t_pgcfg | external pcie reset signal to first configuration cycle | 100 ms | according to pcie spec

4.1.2.2 timing guarantees

the 82576 guarantees the following start-up and power state transition related timing parameters.

table 4-3. power-up timing guarantees

parameter | description | min. / max. | notes
t_xog | xosc stable from power stable | 10 msec |
t_ppg | internal power good delay from valid power rail | 35 msec | use internal counter for external devices stabilization
t_ee | eeprom read duration | 20 msec | actual time depends on the eeprom content
t_opll | pcie reset to start of link training | 10 msec |
t_pcipll | pcie reset to first configuration cycle | 5 msec |
t_pgtrn | pcie reset to start of link training | 20 msec | according to pcie spec
t_pgres | pcie reset to first configuration cycle | 100 msec | according to pcie spec

4.2 reset operation

4.2.1 reset sources

the 82576 reset sources are described below:
4.2.1.1 internal_power_on_reset

the 82576 has an internal mechanism for sensing the power pins. once the power is up and stable, the 82576 creates an internal reset that acts as a master reset of the entire chip. it is level sensitive and, while it is zero, holds all of the registers in reset. internal_power_on_reset is interpreted as an indication that the device power supplies are all stable. internal_power_on_reset changes state during system power-up.

4.2.1.2 pe_rst_n

pe_rst_n going high (reset released) indicates that both the power and the pcie clock sources are stable. this pin asserts an internal reset also after a d3cold exit. most units are reset on the rising edge of pe_rst_n. the only exception is the gio unit, which is kept in reset for as long as pe_rst_n is low (level sensitive).

4.2.1.3 in-band pcie reset

the 82576 generates an internal reset in response to a physical layer message from the pcie or when the pcie link goes down (entry to polling or detect state). this reset is equivalent to pci reset in previous (pci) gigabit lan controllers.

4.2.1.4 d3hot to d0 transition

this is also known as acpi reset. the 82576 generates an internal reset on the transition from d3hot power state to d0 (caused by a configuration write that moves the function from d3 to d0 power state). note that this reset is per function and resets only the function that transitions from d3hot to d0.

4.2.1.5 function level reset (flr)

the flr bit is required for the pf and per vf (virtual function). setting this bit for a vf resets only the part of the logic dedicated to the specific vf and does not influence the shared part of the port. setting the pf flr bit resets the entire function.

4.2.1.5.1 pf (physical function) flr or flr in non-iov mode

an flr reset to a function is equivalent to a d0 → d3 → d0 transition, with the exception that this reset does not require driver intervention in order to stop the master transactions of this function. in an iov enabled system, this reset resets all the vfs attached to the pf.

the eeprom is partially reloaded after an flr reset. the words read from the eeprom at flr are the same as those read at a full software reset.

4.2.1.5.2 vf (virtual function) flr (function level reset)

an flr reset to a vf resets all the queues, interrupts, and statistics registers attached to this vf. it also resets the pcie r/w configuration bits allocated to this function and disables tx and rx flow for the queues allocated to this vf. all pending read requests are dropped, and pcie read completions to this function might be completed as unsupported requests.

4.2.1.5.3 iov (io virtualization) disable
clearing the iov enable bit in the iov structure is equivalent to a vflr to all the active vfs in the pf.

4.2.1.6 software reset

4.2.1.6.1 full software reset

device reset (rst) can be used to globally reset the entire component. this reset is provided primarily as a last-ditch software mechanism to recover from an indeterminate or suspected hung hardware state. most registers (receive, transmit, interrupt, statistics, etc.) and state machines are set to their power-on reset values, approximating the state following a power-on or pci reset. however, pcie configuration registers are not reset, thereby leaving the device mapped into system memory space and accessible by a driver. the packet buffer allocation registers (rxpbs, txpbs and swpbs) also retain their values through a global reset.

note: to ensure that the global device reset has fully completed and that the 82576 responds to subsequent accesses, wait approximately 1 millisecond after setting ctrl.rst before attempting to check whether the bit was cleared, or to access (read or write) any other device register.

software can reset the 82576 by writing the device reset bit of the device control register (ctrl.rst). the 82576 re-reads part of the per-function eeprom fields after a software reset. bits that are normally read from the eeprom are reset to their default hardware values. fields controlled by the led, sdp and init3 words of the eeprom are not reset and not re-read after a software reset.

note: this reset is per function and resets only the function that received the software reset. pci configuration space (configuration and mapping) of the device is unaffected.

prior to issuing a software reset, the driver needs to run the master disable algorithm as defined in section 5.2.3.2.
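the ordering constraints above (master disable first, then set ctrl.rst, then wait about 1 ms before touching any register) can be sketched as follows. this python example runs against a fake device object, so it demonstrates sequencing only. the ctrl offset and the bit position of ctrl.rst are not stated in this section and are assumptions here, as are `FakeDevice`, `software_reset`, and the caller-supplied `master_disable` callback standing in for the section 5.2.3.2 procedure.

```python
import time

CTRL = 0x00000        # assumed offset of the device control register
CTRL_RST = 1 << 26    # assumed bit position of ctrl.rst


class FakeDevice:
    """Stand-in device: the rst bit self-clears as soon as it is written."""

    def __init__(self):
        self.ctrl = 0
        self.resets = 0

    def read(self, addr):
        return self.ctrl if addr == CTRL else 0

    def write(self, addr, value):
        if addr == CTRL:
            if value & CTRL_RST:
                self.resets += 1
            self.ctrl = value & ~CTRL_RST  # models reset completion


def software_reset(dev, master_disable, sleep=time.sleep, max_polls=10):
    """Hypothetical full software reset sequence; returns True on success."""
    master_disable()                            # must precede the reset
    dev.write(CTRL, dev.read(CTRL) | CTRL_RST)  # request the reset
    sleep(0.001)                                # ~1 ms before any access
    for _ in range(max_polls):
        if not dev.read(CTRL) & CTRL_RST:       # bit cleared: reset done
            return True
        sleep(0.001)
    return False
```

passing `sleep` as a parameter keeps the sketch testable without real delays; a driver would use its platform's delay primitive instead.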
4.2.1.6.2 physical function (pf) software reset

a software reset by the pf in iov mode has the same consequences as a software reset in non-iov mode. the procedure for a pf software reset is as follows:
• the pf driver disables master accesses by the device through the master disable mechanism (see section 5.2.3.2). master disable affects the traffic of all vfs.
• execute the procedure described in section 4.5.11.2.3 to synchronize between the pf and the vfs.

vfs are expected to time out and check the vfmailbox.rstd bit in order to identify a pf software reset event. the vfmailbox.rstd bits are cleared on read.

4.2.1.6.3 vf software reset

a software reset applied to a vf is equivalent to an flr to this vf, with the exception that the pcie configuration bits allocated to this function are not reset. it is activated by setting the vtctrl.rst bit. setting vtctrl.rst resets interrupts and queue enable bits; other vf registers are not reset.
4.2.1.7 force tco

this reset is generated when manageability logic is enabled. it is only generated if the reset on force tco bit of the eeprom's management control word is 1. in pass-through mode it is generated when receiving a forcetco smbus command with bit 1 or bit 7 set.

4.2.1.8 firmware reset

this reset is activated by writing a 1 to the fwr bit in the host interface control register (hicr) at csr address 0x8f00.

4.2.1.9 eeprom reset

writing a 1 to the eeprom reset bit of the extended device control register (ctrl_ext.ee_rst) causes the 82576 to re-read the per-function configuration from the eeprom, setting the appropriate bits in the registers loaded by the eeprom.

4.2.1.10 phy reset

software can write a 1 to the phy reset bit of the device control register (ctrl.phy_rst) to reset the internal phy. the phy is internally configured after a phy reset.

note: the phy should not be reset using phyreg 0 bit 15, because in this case the internal configuration process is bypassed and there is no guarantee that the phy will operate correctly.

as the phy may be accessed by both the internal firmware and the driver software, the driver software should coordinate any phy reset with the firmware using the following procedure:

1. check that manc.blk_phy_rst_on_ide (offset 0x5820 bit 18) is cleared. if it is set, the mc requires a stable link and thus the phy should not be reset at this stage. the driver may skip the phy reset if it is not mandatory, or wait for manc.blk_phy_rst_on_ide to clear. see section 4.2.3 for more details.
2. take ownership of the relevant phy using the following flow:
   a. get ownership of the software/software semaphore swsm.smbi (offset 0x5b50 bit 0):
      • read the swsm register.
      • if swsm.smbi is read as zero, the semaphore was taken.
      • otherwise, go back to step a.
      • this step ensures that other software will not access the shared resources register (sw_fw_sync).
   b. get ownership of the software/firmware semaphore swsm.swesmbi (offset 0x5b50 bit 1):
      • set the swsm.swesmbi bit.
      • read swsm.
      • if swsm.swesmbi was successfully set, the semaphore was acquired; otherwise, go back to step a.
      • this step ensures that the internal firmware will not access the shared resources register (sw_fw_sync).
   c. software reads the software-firmware synchronization register (sw_fw_sync) and checks both bits in the pair of bits that control the phy it wishes to own.
      • if both bits are cleared (neither firmware nor other software owns the phy), software sets the software bit in the pair of bits that control the resource it wishes to own.
      • if one of the bits is set (firmware or other software owns the phy), software tries again later.
   d. release ownership of the software/firmware semaphore by clearing the swsm.swesmbi bit.
3. drive the phy reset bit in ctrl (bit 31).
4. wait 100 µs.
5. release the phy reset in ctrl (bit 31).
6. release ownership of the relevant phy to the firmware using the following flow:
   a. get ownership of the software/firmware semaphore swsm.swesmbi (offset 0x5b50 bit 1):
      • set the swsm.swesmbi bit.
      • read swsm.
      • if swsm.swesmbi was successfully set, the semaphore was acquired; otherwise, go back to step a.
   b. clear the bit in sw_fw_sync that controls the software ownership of the resource, to indicate that this resource is free.
   c. release ownership of the software/firmware semaphore by clearing the swsm.swesmbi bit.
7. wait for the relevant cfg_done bit (eemngctl.cfg_done0 - offset 0x1010 bit 18, or eemngctl.cfg_done1 - offset 0x1010 bit 19).
8. take ownership of the relevant phy using the following flow:
   a. get ownership of the software/firmware semaphore swsm.swesmbi (offset 0x5b50 bit 1):
      • set the swsm.swesmbi bit.
      • read swsm.
      • if swsm.swesmbi was successfully set, the semaphore was acquired; otherwise, go back to step a.
      • this step ensures that the internal firmware will not access the shared resources register (sw_fw_sync).
   b. software reads the software-firmware synchronization register (sw_fw_sync) and checks both bits in the pair of bits that control the phy it wishes to own.
      • if both bits are cleared (neither firmware nor other software owns the phy), software sets the software bit in the pair of bits that control the resource it wishes to own.
      • if one of the bits is set (firmware or other software owns the phy), software tries again later.
   c. release ownership of the software/software semaphore and the software/firmware semaphore by clearing the swsm.smbi and swsm.swesmbi bits.
9. configure the phy.
10. release ownership of the relevant phy using the flow described in section 4.6.2.

4.2.2 reset effects

the resets affect the following registers and logic:
table 4-4. 82576 reset effects - common resets

columns: reset activation; internal_power_on_reset; pe_rst_n; in-band pcie reset; fw reset; notes

ltssm (pcie back to detect/polling)                   x x x
pcie link data path                                   x x x
read eeprom (per function)
read eeprom (complete load)                           x x x
pci configuration registers - non sticky              x x x     3.
pci configuration registers - sticky                  x x x     4.
pcie local registers                                  x x x     5.
data path                                             x x x
on-die memories                                       x x x     4.
mac, pcs, auto negotiation, macsec, ipsec             x x x
virtual function queue enable                         x x x
virtual function interrupt & statistics registers     x x x     2.
wake up (pm) context                                  x 1       7.
wake up control register                              x         9.
wake up status registers                              x         11.
rule checker tables                                   x
manageability control registers                       x         12.
mms unit                                              x x
wake-up management registers                          x x x     3., 13.
memory configuration registers                        x x x     3.
eeprom and flash request                              x         5.
phy/serdes phy                                        x x x     2.
strapping pins                                        x x x
circuit breaker                                       x
table 4-5. 82576 reset effects - per function resets

columns: reset activation; d3hot → d0; flr; full sw reset; force tco; ee reset; phy reset; notes

read eeprom (per function)                            x x x x x
pci configuration registers ro                                      3.
pci configuration registers, msi-x                    x x           6.
pci configuration registers rw shared                               8.
pci configuration registers rw                        x x           9.
pcie local registers                                                5.
data path                                             x x x x
on-die memories                                       x x x x       4.
mac, pcs, auto negotiation, macsec, ipsec             x x x x
wake up (pm) context                                                7.
wake up control register                                            9.
wake up status registers                                            11.
rule checker tables
manageability control registers                                     12.
virtual function queue enable                         x x x x       2.
virtual function interrupt & statistics registers     x x x         2.
wake-up management registers                          x x x x       3., 13.
memory configuration registers                        x x x x       3.
eeprom and flash request                              x x           5.
phy/serdes phy                                        x x x x       2.
strapping pins
table 4-6. 82576 reset effects - virtual function resets

reset activation                        vflr (6.)   software reset   notes
interrupt registers                     x           x                2.
queue disable                           x           x
vf specific pcie configuration space    x                            1.
data path

notes:
1. if aux_power = 0b, the wake-up context is reset (the pme_status and pme_en bits should be 0b at reset if the 82576 does not support pme from d3cold).
2. the mms unit must configure the phy after any phy reset.
3. the following register fields do not follow the general rules above:
   a. ctrl.sdp0_iodir, ctrl.sdp1_iodir, ctrl_ext.sdp2_iodir, ctrl_ext.sdp3_iodir, the connsw.enrgsrc field, ctrl_ext.sfp_enable, ctrl_ext.link_mode, ctrl_ext.ext_vlan and the led configuration registers are reset on internal_power_on_reset only. any eeprom read resets these fields to the values in the eeprom.
   b. the aux power detected bit in the pcie device status register is reset on internal_power_on_reset and gio power good only.
   c. the bits mentioned in the next note.
4. the following registers are part of this group:
   a. vpd registers.
   b. max payload size field in the pcie capability control register (offset 0xa8).
   c. active state link pm control field, common clock configuration field and extended synch field in the pcie capability link control register (offset 0xb0).
   d. ari enable bit in the iov capability command register (offset 0x168).
   e. read completion boundary in the pcie link control register (offset 0xb0).
5. the following registers are part of this group:
   a. swsm
   b. gcr (only part of the bits - see register description for details)
   c. functag
   d. gscl_1/2/3/4
   e. gscn_0/1/2/3
   f. sw_fw_sync (only part of the bits - see register description for details)
6. the following registers are part of this group:
   a. msix control register, msix pba and msix per vector mask.
7. the wake-up context is defined in the pci bus power management interface specification (sticky bits). it includes:
   a. pme_en bit of the power management control/status register (pmcsr).
   b. pme_status bit of the power management control/status register (pmcsr).
   c. aux_en in the pcie registers.
   d. the device requester id (since it is required for the pm_pme tlp).
   the shadow copies of these bits in the wakeup control register are treated identically.
8. the following fields are part of the pci configuration registers rw shared group:
   a. captured slot power limit value in the device capabilities register
   b. captured slot power limit scale in the device capabilities register
   c. max_payload_size in the device control register
   d. active state power management (aspm) control in the link control register
   e. read completion boundary (rcb) in the link control register
   f. common clock configuration in the link control register
   g. extended synch in the link control register
   h. enable clock power management in the link control register
   i. hardware autonomous width disable bit in the link control register
   j. hardware autonomous speed disable bit in the link control 2 register
9. refers to all the pci configuration rw registers not included in notes 8. and 6.
10. refers to bits in the wake up control register that are not part of the wake-up context (the pme_en and pme_status bits).
11. the wake up status registers include the following:
   a. wake up status register
   b. wake up packet length
   c. wake up packet memory
12. the manageability control registers refer to the following registers:
   a. manc 0x5820
   b. mfutp01-7 0x5030 - 0x504c
   c. mfval 0x05824
   d. manc2h 0x5860
   e. mavtv1-7 0x5010 - 0x502c
   f. mdef0-7 0x5890 - 0x58ac
   g. mdef_ext 0x5930 - 0x594c
   h. metf 0x5060 - 0x506c
   i. mipaf0-15 0x58b0 - 0x58ec
   j. mmah/mmal0-3 0x5910 - 0x592c
   k. fwsm
13. the wake-up management registers include the following:
   a. wake up filter control
   b. ip address valid
   c. ipv4 address table
   d. ipv6 address table
   e. flexible filter length table
   f. flexible filter mask table
14. the other configuration registers include:
   a. general registers
   b. interrupt registers
   c. receive registers
   d. transmit registers
   e. statistics registers
   f. diagnostic registers
   of these registers, the mta[n], vfta[n], wupm[n], ffmt[n], ffvt[n], tdbah/tdbal, and rdbah/rdbal registers have no default value. if the functions associated with the registers are enabled, they must be programmed by software. once programmed, their value is preserved through all resets as long as power is applied to the 82576.

note: in situations where the device is reset using the software reset ctrl.rst, the tx data lines are forced to all zeros. this causes a substantial number of symbol errors to be detected by the link partner. in tbi mode, if the duration is long enough, the link partner might restart the auto-negotiation process by sending "break-link" (/c/ codes with the configuration register value set to all zeros).

1. these registers include:
   a. msi/msi-x enable bits
   b. bme
   c. error indications
2. these registers include:
   a. vteics
   b. vteims
   c. vteiac
   d. vteiam
   e. vteitr 0-2
   f. vtivar0
   g. vtivar_misc
   h. pbacl
   i. vfmailbox
3. these registers include:
   a. rxdctl.enable
   b. the corresponding bit in vfte and vfre.
4. the contents of the following memories are cleared to support the requirements of pcie flr:
   a. the tx packet buffers
   b. the rx packet buffers
   c. ipsec tx sa tables
   d. ipsec rx sa tables
5. includes the eec.req, eec.gnt, fla.req and fla.gnt fields.
6. a vflr does not reset the configuration of the vf; it only disables the interrupts and the queues.
4.2.3 phy behavior during a manageability session

during some manageability sessions (e.g. an ider or sol session initiated by an external mc), the platform is reset so that it boots from remote media. this reset must not cause the ethernet link to drop, since the manageability session would be lost. for the same reason, the ethernet link should be kept up continuously during the session. the 82576 therefore limits the cases in which the internal phy would restart the link, by masking two types of events from the internal phy:
• pe_rst# and pcie resets (in-band and link drop) do not reset the phy during such a manageability session.
• the phy does not change link speed as a result of a change in power management state, to avoid link loss. for example, the transition to d3hot state is not propagated to the phy.
  note, however, that if main power is removed, the phy is allowed to react to the change in power state (i.e., the phy might respond with a link speed change). the motivation for this exception is to reduce power when operating on auxiliary power by reducing link speed.

the capability described in this section is disabled by default on lan power good reset. the keep_phy_link_up_en bit in the eeprom must be set to '1' to enable it. once enabled, the feature remains enabled until the next lan power good (i.e., the 82576 does not revert to the hardware default value on pe_rst#, pcie reset or any other reset but lan power good).

when the keep_phy_link_up bit (also known as the "veto bit") in the manc register is set, the following applies:
• the phy is not reset on pe_rst# and pcie resets (in-band and link drop). other reset events are not affected: lan power good reset, device disable, force tco, and phy reset by software.
• the phy does not change its power state. as a result, link speed does not change.
• the 82576 does not initiate configuration of the phy, to avoid losing link.

the keep_phy_link_up bit is set by the mc through the management control command (see section 10.5 for smbus commands and section 10.6 for nc-si commands) on the sideband interface. it is cleared by the external mc (again, through a command on the sideband interface) when the manageability session ends. once the keep_phy_link_up bit is cleared, the phy updates its dx state and acts accordingly (e.g. negotiates its speed).

the keep_phy_link_up bit is also cleared on de-assertion of the main_pwr_ok input pin. main_pwr_ok must be de-asserted at least 1 msec before power drops below its 90% value. this allows enough time to respond before auxiliary power takes over.

the keep_phy_link_up bit is a r/w bit and can be accessed by host software, but software is not expected to clear the bit. the bit is cleared in the following cases:
• on lan power good
• when the mc resets or initializes it
• on de-assertion of the main_pwr_ok input pin. the mc should set the bit again if it wishes to maintain speed on exit from dr state.
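the driver-side counterpart of this behavior is the phy reset procedure of section 4.2.1.10, which first checks manc.blk_phy_rst_on_ide and then takes ownership of the phy before pulsing ctrl bit 31. the following is a rough sketch of steps 1 to 6 of that procedure against an in-memory register file. several things here are assumptions made only for the example: the sw_fw_sync offset (0x5b5c), the bit pair used for phy 0 (software bit 1, firmware bit 17), the retry counts, the fake register model in which a read of swsm sets smbi, and the choice to drop both semaphores when the phy is owned by someone else. steps 7 to 10 (waiting for cfg_done, re-taking ownership and configuring the phy) are not shown.

```python
import time

CTRL, MANC, SWSM = 0x0000, 0x5820, 0x5B50
SW_FW_SYNC = 0x5B5C                      # assumed offset
CTRL_PHY_RST = 1 << 31
MANC_BLK_PHY_RST_ON_IDE = 1 << 18
SWSM_SMBI, SWSM_SWESMBI = 1 << 0, 1 << 1
PHY0_SW, PHY0_FW = 1 << 1, 1 << 17       # assumed sw_fw_sync pair for phy 0


class FakeRegs:
    """In-memory registers; reading SWSM sets SMBI, modelling the hardware semaphore."""

    def __init__(self):
        self.regs = {CTRL: 0, MANC: 0, SWSM: 0, SW_FW_SYNC: 0}
        self.writes = []

    def read(self, offset):
        value = self.regs[offset]
        if offset == SWSM:
            self.regs[SWSM] = value | SWSM_SMBI
        return value

    def write(self, offset, value):
        self.regs[offset] = value & 0xFFFFFFFF
        self.writes.append((offset, value & 0xFFFFFFFF))


def take_swesmbi(regs, retries):
    """Set SWESMBI and confirm by reading it back (software/firmware semaphore)."""
    for _ in range(retries):
        regs.write(SWSM, regs.read(SWSM) | SWSM_SWESMBI)
        if regs.read(SWSM) & SWSM_SWESMBI:
            return True
    return False


def release(regs, mask):
    regs.write(SWSM, regs.read(SWSM) & ~mask)


def reset_phy(regs, sw_bit=PHY0_SW, fw_bit=PHY0_FW, sleep=time.sleep, retries=100):
    """Return True if the PHY was reset, False if the reset had to be skipped."""
    # step 1: the mc needs a stable link, so do not reset now
    if regs.read(MANC) & MANC_BLK_PHY_RST_ON_IDE:
        return False
    # step 2a: a read that returns smbi == 0 means this caller now holds it
    for _ in range(retries):
        if not regs.read(SWSM) & SWSM_SMBI:
            break
    else:
        return False
    # steps 2b-2d: claim the phy in sw_fw_sync under the software/firmware semaphore
    if not take_swesmbi(regs, retries):
        return False
    sync = regs.read(SW_FW_SYNC)
    owned = not sync & (sw_bit | fw_bit)
    if owned:
        regs.write(SW_FW_SYNC, sync | sw_bit)
        release(regs, SWSM_SWESMBI)
    else:
        # assumption: drop both semaphores so the caller can try again later
        release(regs, SWSM_SWESMBI | SWSM_SMBI)
        return False
    # steps 3-5: pulse ctrl bit 31 for 100 microseconds
    ctrl = regs.read(CTRL)
    regs.write(CTRL, ctrl | CTRL_PHY_RST)
    sleep(100e-6)
    regs.write(CTRL, ctrl & ~CTRL_PHY_RST)
    # step 6: hand the phy back to the firmware
    if not take_swesmbi(regs, retries):
        return False
    regs.write(SW_FW_SYNC, regs.read(SW_FW_SYNC) & ~sw_bit)
    release(regs, SWSM_SWESMBI)
    return True   # smbi is still held here; the procedure releases it in step 8c
```

a caller that gets false back can either skip the phy reset, if it is not mandatory, or retry later, as step 1 of the procedure allows.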
4.3 function disable

4.3.1 general

for a lom (lan on motherboard) design, it might be desirable for the system to provide bios-setup capability for selectively enabling or disabling lan functions. this gives the end user more control over system resource management and avoids conflicts with add-in nic solutions. the 82576 provides support for selectively enabling or disabling one or both lan device(s) in the system.

4.3.2 overview

device presence (or non-presence) must be established early during bios execution, in order to ensure that bios resource allocation (of interrupts, of memory or io regions) is done only for devices that are present. this is frequently accomplished using a bios cvdr (configuration values driven on reset) mechanism. the 82576 lan-disable mechanism is implemented to be compatible with such a solution.

the 82576 provides two mechanisms to disable lan ports:
• two pins (lanx_dis_n, one per lan port) are sampled on reset to determine the lan-enable configuration.
• port 1 might be disabled using eeprom configuration.

disabling a lan port affects the pci function it resides on. when function 0 is disabled (either lan0 or lan1), two different behaviors are possible:
• dummy function mode - in some systems, it is required to keep all the functions at their respective locations, even when other functions are disabled. in dummy function mode, if function #0 (either lan0 or lan1) is disabled, it does not disappear from the pcie configuration space. rather, the function presents itself as a dummy function. the device id and class code of this function change to other values (dummy function device id 0x10a6, class code 0xff0000). in addition, the function does not require any memory or i/o space, and does not require an interrupt line.
• legacy mode - when function 0 is disabled (either lan0 or lan1), the port residing on function 1 moves to reside on function 0. function 1 disappears from the pci configuration space.

note: in some systems, the dummy function is not recognized by the enumeration process as a valid pci function. in these systems, neither port is enumerated, and it is recommended to work in legacy mode.

the disabled lan port is still available for manageability purposes if it was disabled using the lan_pci_dis bit of the sdp control word in the eeprom, or if it was disabled through the pin mechanism and the phy_in_lan_disable bit in the sdp control word in the eeprom is cleared. in this case, and if the lplu bit is set, the phy attempts to create a link at 10 mbps.

note: dummy function mode should not be used if sr-iov capability is exposed (since pf0 is required to support certain functionality). sr-iov is enabled by the iov enable bit in eeprom word 0x25 (section 6.2.24).
mapping between function and lan ports is summarized in the following tables. the following eeprom bits control function disable:
• the access of the host through a pci function to lan1 can be enabled or disabled according to the "lan pci disable" bit in eeprom word 0x10 (section 6.2.8).
• the "lan disable select" eeprom field in word 0x10 indicates if port 1 is disabled (section 6.2.8).
• the "lan function select" bit in eeprom word 0x21 defines the correspondence between lan port and pci function (section 6.2.22).
• the "dummy function enable" bit in eeprom word 0x1b enables the dummy function mode. default value is disabled (section 6.2.18).
• the "phy_in_lan_disable" bit in eeprom words 0x10 and 0x20 controls the availability of the disabled function to the manageability channel when disabled through the lan0_dis_n or lan1_dis_n pins (section 6.2.8 and section 6.2.9).

when a particular lan is fully disabled, all internal clocks to that lan are disabled, the device is held in reset, and the internal phy for that lan is powered down. in both modes, the device does not respond to pci configuration cycles. effectively, the lan device becomes invisible to the system from both a configuration and power-consumption standpoint.

table 4-7. pci functions mapping (legacy mode)

pci function #                    lan function select   function 0   function 1
both lan functions are enabled    0                     lan 0        lan 1
                                  1                     lan 1        lan 0
lan 0 is disabled                 x                     lan 1        disable
lan 1 is disabled                 x                     lan 0        disable
both lan functions are disabled   both pci functions are disabled. device is in low power mode.

table 4-8. pci functions mapping (dummy function mode)

pci function #                    lan function select   function 0   function 1
both lan functions are enabled    0                     lan 0        lan 1
                                  1                     lan 1        lan 0
lan 0 is disabled                 0                     dummy        lan 1
                                  1                     lan 1        disable
lan 1 is disabled                 0                     lan 0        disable
                                  1                     dummy        lan 0
both lan functions are disabled   both pci functions are disabled. device is in low power mode.
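the two mapping tables can be expressed as one small lookup, which may help when checking a bios or eeprom configuration against the expected enumeration result. the sketch below only encodes tables 4-7 and 4-8; the function name, argument names and the string/none return values are invented for the example.

```python
def map_functions(lan0_enabled, lan1_enabled, lan_function_select, dummy_mode):
    """Return (function 0, function 1) as 'lan0', 'lan1', 'dummy' or None (disabled)."""
    if not lan0_enabled and not lan1_enabled:
        return (None, None)                      # device is in low power mode

    # placement when both ports are enabled, per the lan function select bit
    both = ("lan1", "lan0") if lan_function_select else ("lan0", "lan1")
    if lan0_enabled and lan1_enabled:
        return both

    survivor = "lan0" if lan0_enabled else "lan1"
    if not dummy_mode:
        return (survivor, None)                  # legacy mode: survivor moves to function 0
    if both[0] == survivor:
        return (survivor, None)                  # survivor already sits on function 0
    return ("dummy", survivor)                   # function 0 is kept as a dummy function


# example: lan 0 strapped off, lan function select = 0, dummy function mode enabled
print(map_functions(False, True, 0, True))
```

in legacy mode the lan function select bit has no effect once a port is disabled, which is why the tables show "x" in that column.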
4.3.3 control options

the functions have separate enabling mechanisms. any function that is not enabled does not function and does not expose its pci configuration registers.

table 4-9. strapping for control options

function   control options
lan 0      strapping option + eeprom word 0x20 bit 13 (full/pci only disable in case of strap)
lan 1      strapping option + eeprom word 0x10 bit 13 (full/pci only disable in case of strap) / eeprom word 0x10 bit 11 (full disable) / eeprom word 0x10 bit 10 (pci only disable)

4.3.3.1 pci functions disable options

the 82576 strapping option for lan disable feature:

table 4-10. strapping for lan disable

symbol       ball #   name and function
lan0_dis_n   b13      this pin is a strapping option pin, always active. it has an internal weak pull-up resistor. if this pin is not connected or is driven high during init time, lan 0 is enabled. if this pin is driven low during init time, lan 0 is disabled. this pin is also used for testing and scan; when used for testing or scan, the lan disable functionality is not active.
lan1_dis_n   a15      this pin is a strapping option pin, always active. it has an internal weak pull-up resistor. if this pin is not connected or is driven high during init time, lan 1 is enabled. if this pin is driven low during init time, the lan 1 function is disabled. this pin is also used for testing and scan; when used for testing or scan, the lan disable functionality is not active.

4.3.4 event flow for enable/disable functions

this section describes the driving levels and event sequence for device functionality. following a power on reset / internal power / pe_rst_n / in-band reset, the lanx_dis_n signals should be driven high (or left open) for nominal operation. if any of the lan functions is not required statically, its associated disable strapping pin can be tied statically low.

case a - the bios disables the lan function at boot time by using strapping:
1. assume that following the power-up sequence the lanx_dis_n signals are driven high.
2. the pcie link is established following perst.
3. the bios recognizes that a lan function in the 82576 should be disabled.
4. the bios drives the lanx_dis_n signal low.
5. the bios should assert the pcie reset, either in-band or via pe_rst_n.
6. as a result, the 82576 samples the lanx_dis_n signals, disables the lan function, and issues an internal reset to this function.
7. the bios might start the device enumeration procedure (the disabled lan function is invisible or changed to a dummy function).
8. proceed with nominal operation.
9. the function can be re-enabled by driving the lanx_dis_n signal high and then asking the user to issue a warm boot that triggers bus enumeration.

4.3.4.1 multi-function advertisement

if one of the lan devices is disabled, the 82576 is no longer a multi-function device. the 82576 normally reports 0x80 in the header type field of the pci configuration header, indicating multi-function capability. however, if a lan is disabled, the 82576 reports 0x0 in this field to signify single-function capability.

4.3.4.2 legacy interrupts utilization

when both lan devices are enabled, the 82576 can utilize the inta# to intc# interrupts for interrupt reporting. the eeprom initialization control word 3 (bits 12:11) associated with each lan device controls which of these interrupts is used for each lan device. the specific interrupt pin utilized is reported in the pci configuration header interrupt pin field associated with each lan device. however, if only one lan device is enabled, then inta# must be used for this lan device, regardless of the eeprom configuration. under these circumstances, the interrupt pin field of the pci header always reports a value of 0x1, indicating inta# usage.

4.3.4.3 power reporting

when both lan devices are enabled, the pci power management register block has the capability of reporting a "common power" value. the common power value is reflected in the data field of the pci power management registers. the value reported as common power is specified via the eeprom, and is reflected in the data field whenever the data_select field has a value of 0x8 (0x8 = common power value select). when only one lan is enabled and the 82576 appears as a single-function device, the common power value, if selected, reports 0x0 (undefined value), as common power is undefined for a single-function device.
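the three reporting rules in sections 4.3.4.1 to 4.3.4.3 all depend on whether both lan devices are enabled, so they can be summarized in one helper. this is a sketch for illustration only: the function name, the dictionary keys and the idea of passing in an already decoded interrupt pin value are assumptions, and the eeprom encoding of bits 12:11 is not modelled.

```python
MULTI_FUNCTION = 0x80        # header type reported when both lans are enabled
SINGLE_FUNCTION = 0x00       # header type reported when one lan is disabled
INTA = 0x1                   # interrupt pin value that indicates inta#
COMMON_POWER_SELECT = 0x8    # data_select value that selects the common power


def pci_view(lan0_enabled, lan1_enabled, configured_pin, common_power, data_select):
    """Summarize what one enabled function reports (at least one lan assumed enabled)."""
    both = lan0_enabled and lan1_enabled
    view = {
        "header_type": MULTI_FUNCTION if both else SINGLE_FUNCTION,
        # with a single lan enabled, inta# is used regardless of the eeprom setting
        "interrupt_pin": configured_pin if both else INTA,
    }
    if data_select == COMMON_POWER_SELECT:
        # common power is undefined (0x0) for a single-function device
        view["pm_data"] = common_power if both else 0x0
    return view
```

other data_select values are left out of the result because this section does not describe them.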
4.4 device disable

for a lom design, it might be desirable for the system to provide bios-setup capability for selectively enabling or disabling lom devices. this might allow the end user more control over system resource management, avoid conflicts with add-in nic solutions, etc. the 82576 supports selectively enabling or disabling the whole device.

note: if the 82576 is configured to provide a 50 mhz nc-si clock (via the nc-si output clock eeprom bit), then the device should not be disabled.

device disable is initiated by assertion of the asynchronous dev_off_n pin. the dev_off_n pin should always be connected to enable correct device operation. the eeprom "power down enable" bit (section 6.2.7) enables device disable mode (hardware default is that the mode is disabled).

while in device disable mode, the pcie link is in l3 state, the phy is in power-down mode, and output buffers are tri-stated.
assertion or de-assertion of pcie pe_rst_n does not have any effect while the device is in device disable mode (i.e., the device stays in the respective mode as long as dev_off_n is asserted). however, the device might momentarily exit the device disable mode from the time pcie pe_rst_n is de-asserted again until the eeprom is read.

during power-up, the dev_off_n pin is ignored until the eeprom is read. from that point, the device might enter device disable if dev_off_n is asserted.

note: de-assertion of the dev_off_n pin causes a fundamental reset to the 82576.

note to system designer: the dev_off_n pin should maintain its state during system reset and system sleep states. it should also ensure the proper default value on system power-up. for example, one could use a gpio pin that defaults to '1' (enable) and is on system suspend power (i.e., it maintains state in s0-s5 acpi states).

4.4.1 bios handling of device disable

1. assume that following the power-up sequence the dev_off_n signal is driven high (else the device is already disabled).
2. the pcie link is established following perst.
3. the bios recognizes that the whole device should be disabled.
4. the bios drives the dev_off_n signal low.
5. as a result, the 82576 samples the dev_off_n signal and enters device disable mode.
6. the bios puts the link in the electrical idle state (at the other end of the pcie link) by setting the link disable bit in the link control register.
7. the bios might start the device enumeration procedure (all of the device functions are invisible).
8. proceed with nominal operation.
9. the device can be re-enabled by driving the dev_off_n signal high, followed later by bus enumeration.

4.5 software initialization and diagnostics

4.5.1 introduction

this chapter discusses general software notes for the 82576, especially initialization steps.
this includes general hardware, power-up state, basic device configuration, initialization of transmit and receive operation, link configuration, software reset capability, statistics, and diagnostic hints.

4.5.2 power up state

when the 82576 powers up it reads the eeprom. the eeprom contains sufficient information to bring the link up and configure the 82576 for manageability and/or apm wakeup. however, software initialization is required for normal operation. the power-up sequence, as well as the transitions between power states, is described in section 4.1.1. the detailed timing is given in section 5.5. the next section gives more details on configuration requirements.
4.5.3 initialization sequence

the following sequence of commands is typically issued to the device by the software device driver in order to initialize the 82576 for normal operation. the major initialization steps are:
• disable interrupts - see interrupts during initialization.
• issue global reset and perform general configuration - see global reset and general configuration.
• setup the phy and the link - see link setup mechanisms and control/status bit summary.
• initialize all statistical counters - see initialization of statistics.
• initialize receive - see receive initialization.
• initialize transmit - see transmit initialization.
• enable interrupts - see interrupts during initialization.

4.5.4 interrupts during initialization

most drivers disable interrupts during initialization to prevent re-entering the interrupt routine. interrupts are disabled by writing to the imc register. note that the interrupts also need to be disabled after issuing a global reset, so a typical driver initialization flow is:
• disable interrupts
• issue a global reset
• disable interrupts (again)

after the initialization is done, a typical driver enables the desired interrupts by writing to the ims register.

4.5.5 global reset and general configuration

device initialization typically starts with a global reset that puts the device into a known state and enables the device driver to continue the initialization sequence. several values in the device control register (ctrl) need to be set upon power up or after a device reset for normal operation:
• fd should be set per interface negotiation (if done in software), or is set by the hardware if the interface is auto-negotiating. this is reflected in the device status register in the auto-negotiating case.
• speed is determined via auto-negotiation by the phy, auto-negotiation by the pcs layer in sgmii/serdes mode, or forced by software if the link is forced. status information for speed is also readable in status.
• ilos should normally be set to 0.

set the packet buffer allocation for transmit and receive flows in the rxpbs, txpbs and swpbs registers. this should be done before rctl.rxen and tctl.txen are set. an ordered disabling of all queues and of the rx and tx flows is required before any change in the packet buffer allocation is done.
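the ordering described in sections 4.5.3 and 4.5.4 (mask interrupts, reset, mask interrupts again, configure, then enable the wanted interrupts) can be sketched as follows. the device object, its global_reset method, the configure callback and the ims/imc offsets (0x1508 and 0x150c) are assumptions made for the example; the fake device only records the order of operations and tracks an interrupt mask.

```python
IMS = 0x1508                  # interrupt mask set register offset (assumed)
IMC = 0x150C                  # interrupt mask clear register offset (assumed)
ALL_INTERRUPTS = 0xFFFFFFFF


class FakeDevice:
    """Records the order of operations and tracks the enabled interrupt causes."""

    def __init__(self):
        self.mask = 0
        self.trace = []

    def write(self, offset, value):
        if offset == IMC:
            self.mask &= ~value & 0xFFFFFFFF   # writing 1s to imc disables causes
            self.trace.append("imc")
        elif offset == IMS:
            self.mask |= value                 # writing 1s to ims enables causes
            self.trace.append("ims")

    def global_reset(self):
        self.trace.append("reset")


def initialize(dev, configure, wanted_interrupts):
    """Run the typical driver initialization flow in the documented order."""
    dev.write(IMC, ALL_INTERRUPTS)     # disable interrupts
    dev.global_reset()                 # issue a global reset
    dev.write(IMC, ALL_INTERRUPTS)     # disable interrupts again after the reset
    configure(dev)                     # general config, phy/link, statistics, rx, tx
    dev.write(IMS, wanted_interrupts)  # enable the desired interrupts last
```

the configure callback is where the packet buffer allocation would be programmed, before rctl.rxen and tctl.txen are set.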
4.5.6 flow control setup

if flow control is enabled, program the fcrtl, fcrth, fcttv and fcrtv registers. in order to avoid packet losses, fcrth should be set to a value at least two maximum-size packets below the receive buffer size. for example, assuming a packet buffer size of 32k and an expected maximum packet size of 9.5k, the fcrth value should be set to 32 - 2 * 9.5 = 13k, i.e. rth (expressed in 16-byte units) should be set to 0x340.

4.5.7 link setup mechanisms and control/status bit summary

note: the ctrl_ext.link_mode value should be set to the desired mode prior to the setting of the other fields in the link setup procedures.

4.5.7.1 phy initialization

refer to the phy documentation for the initialization and link setup steps. the device driver uses the mdic register to initialize the phy and set up the link. section 3.5.4.3 describes the link setup for the internal copper phy. section 3.5.2.2 describes the usage of the mdic register.

4.5.7.2 mac/phy link setup (ctrl_ext.link_mode = 00b)

this section summarizes the various means of establishing proper mac/phy link setups, differences in mac ctrl register settings for each mechanism, and the relevant mac status bits. the methods are ordered in terms of preference (the first mechanism being the most preferred).

4.5.7.2.1 mac settings automatically based on duplex and speed resolved by phy (ctrl.frcdplx = 0b, ctrl.frcspd = 0b)

ctrl.fd: don't care; duplex setting is established from phy's internal indication to the mac (fdx) after phy has auto-negotiated a successful link-up.
ctrl.slu: must be set to 1 by software to enable communications between mac and phy.
ctrl.rfce: must be set by s/w after reading flow control resolution from phy registers.
ctrl.tfce: must be set by s/w after reading flow control resolution from phy registers.
ctrl.speed: don't care; speed setting is established from phy's internal indication to the mac (spd_ind) after phy has auto-negotiated a successful link-up.
status.fd: reflects the actual duplex setting (fdx) negotiated by the phy and indicated to mac.
status.lu: reflects link indication (link) from phy qualified with ctrl.slu (set to 1).
status.speed: reflects actual speed setting negotiated by the phy and indicated to the mac (spd_ind).

4.5.7.2.2 mac duplex and speed settings forced by software based on resolution of phy (ctrl.frcdplx = 1b, ctrl.frcspd = 1b)

ctrl.fd: set by software based on reading phy status register after phy has auto-negotiated a successful link-up.
ctrl.slu: must be set to 1 by software to enable communications between mac and phy.
ctrl.rfce: must be set by s/w after reading flow control resolution from phy registers.
ctrl.tfce: must be set by s/w after reading flow control resolution from phy registers.
ctrl.speed: set by software based on reading phy status register after phy has auto-negotiated a successful link-up.
status.fd: reflects the mac forced duplex setting written to ctrl.fd.
status.lu: reflects link indication (link) from phy qualified with ctrl.slu (set to 1).
status.speed: reflects mac forced speed setting written in ctrl.speed.

4.5.7.2.3 mac/phy duplex and speed settings both forced by software (fully-forced link setup) (ctrl.frcdplx = 1b, ctrl.frcspd = 1b, ctrl.slu = 1b)

ctrl.fd: set by software to desired full/half duplex operation (must match duplex setting of phy).
ctrl.slu: must be set to 1 by software to enable communications between mac and phy. phy must also be forced/configured to indicate positive link indication (link) to the mac.
ctrl.rfce: must be set by s/w to desired flow-control operation (must match flow-control settings of phy).
ctrl.tfce: must be set by s/w to desired flow-control operation (must match flow-control settings of phy).
ctrl.speed: set by software to desired link speed (must match speed setting of phy).
status.fd: reflects the mac duplex setting written by software to ctrl.fd.
status.lu: reflects 1 (positive link indication link from phy qualified with ctrl.slu). note that since both ctrl.slu and the phy link indication link are forced, this bit being set does not guarantee that operation of the link has been truly established.
status.speed: reflects mac forced speed setting written in ctrl.speed.
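The three mac/phy mechanisms above differ mainly in whether the force bits are set and whether ctrl.fd and ctrl.speed carry meaning. The sketch below composes a ctrl value for the automatic and forced cases. All bit positions, and the 00b/01b speed encodings, are assumptions borrowed from e1000-family driver headers rather than from this section (only the 10b encoding for 1000 mb/s appears in the text); `ctrl_link_bits` is a hypothetical helper.

```python
# Bit positions are assumptions borrowed from e1000-family driver headers.
CTRL_FD = 1 << 0
CTRL_SLU = 1 << 6
CTRL_SPEED_SHIFT = 8            # 2-bit field; 10b = 1000 mb/s per the text
CTRL_FRCSPD = 1 << 11
CTRL_FRCDPLX = 1 << 12
CTRL_RFCE = 1 << 27
CTRL_TFCE = 1 << 28

# The 10 and 100 mb/s encodings are assumed, not taken from this section.
SPEED_CODE = {10: 0b00, 100: 0b01, 1000: 0b10}


def ctrl_link_bits(forced, speed_mbps=1000, full_duplex=True,
                   rx_fc=False, tx_fc=False):
    """Return the link-related ctrl bits for one of the setups above."""
    value = CTRL_SLU                      # slu is required in every mode
    if rx_fc:
        value |= CTRL_RFCE
    if tx_fc:
        value |= CTRL_TFCE
    if forced:
        # Forced modes: fd and speed are written by software.
        value |= CTRL_FRCSPD | CTRL_FRCDPLX
        value |= SPEED_CODE[speed_mbps] << CTRL_SPEED_SHIFT
        if full_duplex:
            value |= CTRL_FD
    # Automatic mode: fd and speed are "don't care", so they stay zero.
    return value


auto_mode = ctrl_link_bits(forced=False)
forced_100_half = ctrl_link_bits(forced=True, speed_mbps=100, full_duplex=False)
```

In a real driver these bits would be merged into the current ctrl contents with a read-modify-write instead of being written as a whole value.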
4.5.7.3 mac/serdes link setup (ctrl_ext.link_mode = 11b)

link setup procedures using an external serdes interface mode:

4.5.7.3.1 hardware auto-negotiation enabled (pcs_lctl.an enable = 1b; ctrl.frcspd = 0b; ctrl.frcdplx = 0b)

ctrl.fd: ignored; duplex is set by priority resolution of pcs_andv and pcs_lpab.
ctrl.slu: must be set to 1 by software to enable communications to the serdes.
ctrl.rfce: set by hardware according to auto negotiation resolution (see note 1).
ctrl.tfce: set by hardware according to auto negotiation resolution (see note 1).
ctrl.speed: ignored; speed is always 1000 mb/s when using serdes mode communications.
status.fd: reflects hardware-negotiated priority resolution.
status.lu: reflects pcs_lsts.an complete (auto-negotiation complete).
status.speed: reflects 1000 mb/s speed, reporting fixed value of 10b.
pcs_lctl.fsd: must be zero.
pcs_lctl.force flow control: must be zero (see note 1).
pcs_lctl.fsv: must be set to 10b. only 1000 mb/s is supported in serdes mode.
pcs_lctl.fdv: ignored; duplex is set by priority resolution of pcs_andv and pcs_lpab.

note 1: if pcs_lctl.force flow control is set, the auto negotiation result is not reflected in the ctrl.rfce and ctrl.tfce registers. in this case, the software must set these fields after reading flow control resolution from pcs registers.

4.5.7.3.2 auto-negotiation skipped (pcs_lctl.an enable = 0b; ctrl.frcspd = 1b; ctrl.frcdplx = 1b)

ctrl.fd: must be set to 1b. only full duplex is supported in serdes mode.
ctrl.slu: must be set to 1 by software to enable communications to the serdes.
ctrl.rfce: set by software for the desired mode of operation.
ctrl.tfce: set by software for the desired mode of operation.
ctrl.speed: must be set to 10b. only 1000 mb/s is supported in serdes mode.
status.fd: reflects the value written by software to ctrl.fd.
status.lu: reflects whether the pcs detected comma symbols, qualified with ctrl.slu (set to 1).
status.speed: reflects 1000 mb/s speed, reporting fixed value of 10b.
pcs_lctl.fsd: must be set to 1 by software to enable communications to the serdes.
pcs_lctl.force flow control: must be set to 1.
pcs_lctl.fsv: must be set to 10b. only 1000 mb/s is supported in serdes mode.
pcs_lctl.fdv: must be set to 1b. only full duplex is supported in serdes mode.

4.5.7.4 mac/sgmii link setup (ctrl_ext.link_mode = 10b)

link setup procedures using an external sgmii interface mode:

4.5.7.4.1 hardware auto-negotiation enabled (pcs_lctl.an enable = 1b, ctrl.frcdplx = 0b, ctrl.frcspd = 0b)

ctrl.fd: ignored; duplex is set by priority resolution of pcs_andv and pcs_lpab.
ctrl.slu: must be set to 1 by software to enable communications to the serdes.
ctrl.rfce: must be set by software after reading flow control resolution from pcs registers.
ctrl.tfce: must be set by software after reading flow control resolution from pcs registers.
ctrl.speed: ignored; speed setting is established from sgmii's internal indication to the mac after sgmii has auto-negotiated a successful link-up.
status.fd: reflects hardware-negotiated priority resolution.
status.lu: reflects pcs_lsts.link ok.
status.speed: reflects actual speed setting negotiated by the sgmii and indicated to the mac.
pcs_lctl.force flow control: ignored.
pcs_lctl.fsd: should be set to zero.
pcs_lctl.fsv: ignored; speed is set by priority resolution of pcs_andv and pcs_lpab.
pcs_lctl.fdv: ignored; duplex is set by priority resolution of pcs_andv and pcs_lpab.

4.5.8 initialization of statistics

statistics registers are hardware-initialized to values as detailed in each particular register's description. the initialization of these registers begins upon transition to d0active power state (when internal registers become accessible, as enabled by setting the memory access enable of the pcie command register), and is guaranteed to be completed within 1 µsec of this transition. access to statistics registers prior to this interval might return indeterminate values.

all of the statistical counters are cleared on read, and a typical device driver reads them (thus making them zero) as a part of the initialization sequence.

4.5.9 receive initialization

program the receive address register(s) per the station address. this can come from the eeprom or from any other means (for example, on some machines, this comes from the system prom, not the eeprom on the adapter card).

set up the mta (multicast table array) per software.
this means zeroing all entries initially and adding in entries as requested.

program rctl with appropriate values. if initializing it at this stage, it is best to leave the receive logic disabled (en = 0b) until after the receive descriptor ring has been initialized. if vlans are not used, software should clear vfe; there is then no need to initialize the vfta.

select the receive descriptor type.

the following should be done once per receive queue needed:
• allocate a region of memory for the receive descriptor list.
• allocate receive buffers of appropriate size and store pointers to these buffers in the descriptor ring.
• program the descriptor base address with the address of the region.
• set the length register to the size of the descriptor ring.
• program srrctl of the queue according to the size of the buffers and the required header handling.
• if header split or header replication is required for this queue, program the psrtype register according to the required headers.
• enable the queue by setting rxdctl.enable. in the case of queue zero, the enable bit is set by default, so the ring parameters should be set before rctl.rxen is set.
• poll the rxdctl register until the enable bit is set. the tail should not be bumped before this bit is read as one.
• program the direction of packets to this queue according to the mode selected in mrqc. packets directed to a disabled queue are dropped.

note: the tail register of the queue (rdt[n]) should not be bumped until the queue is enabled.

4.5.9.1 initialize the receive control register

to properly receive packets, the receiver should be enabled by setting rctl.rxen. this should be done only after all other setup is accomplished. if software uses the receive descriptor minimum threshold interrupt, that value should be set.

4.5.9.2 dynamic enabling and disabling of receive queues

receive queues can be dynamically enabled or disabled provided the following procedure is followed:

enabling:
• follow the per queue initialization described in the previous section.
• note that if there are still packets in the packet buffer directed to this queue according to previous settings, they are received after the queue is re-enabled. in order to avoid this condition, the software might poll the pbrwac register. once two wrap-arounds or an empty condition of the relevant packet buffer is detected, the queue might be re-enabled.

disabling:
• disable the direction of packets to this queue.
• disable the queue by clearing rxdctl.enable. the 82576 stops fetching and writing back descriptors from this queue immediately. the 82576 eventually completes the storage of one buffer allocated to this queue. any further packet directed to this queue is dropped. if the currently processed packet is spread over more than one buffer, all subsequent buffers are not written.
• the 82576 clears rxdctl.enable only after all pending memory accesses to the descriptor ring or to the buffers are done. the driver should poll this bit before releasing the memory allocated to this queue.

the rx path might be disabled only after all rx queues are disabled.

4.5.10 transmit initialization

program the tctl register according to the mac behavior needed.

if operation in half duplex mode is expected, program the tctl_ext.cold field. for internal phy mode the default value of 0x41 is ok. for sgmii mode, a value reflecting the 82576 and the phy sgmii delays should be used. a suggested value for a typical phy is 0x46 for 10 mbps and 0x4c for 100 mbps.
the following should be done once per transmit queue:
• allocate a region of memory for the transmit descriptor list.
• program the descriptor base address with the address of the region.
• set the length register to the size of the descriptor ring.
• program the txdctl register with the desired tx descriptor write back policy. suggested values are:
  • wthresh = 1b
  • all other fields 0b.
• if needed, set the tdwbal/tdwbah to enable head write back.
• enable the queue using txdctl.enable (queue zero is enabled by default).
• poll the txdctl register until the enable bit is set.

note: the tail register of the queue (tdt[n]) should not be bumped until the queue is enabled.

enable transmit path by setting tctl.en. this should be done only after all other settings are done.

4.5.10.1 dynamic queue enabling and disabling

transmit queues can be dynamically enabled or disabled provided the following procedure is followed:

enabling:
• follow the per queue initialization described in the previous section.

disabling:
• stop storing packets for transmission in this queue.
• wait until the head of the queue (tdh) is equal to the tail (tdt), i.e. the queue is empty.
• disable the queue by clearing txdctl.enable.

the tx path might be disabled only after all tx queues are disabled.

4.5.11 virtualization initialization flow

4.5.11.1 next generation vmdq mode

4.5.11.1.1 global filtering and offload capabilities

• select one of the next generation vmdq pooling methods - mac/vlan filtering for pool selection and rss for the queue in pool selection. mrqc.multiple receive queues enable = 011b, 100b or 101b.
• in rss mode, the rss key (rssrk) and redirection table (reta) should be programmed. note that the redirection table is common to all the pools and only indicates the queue inside the pool to use once the pool is chosen.
• set the rplolr and rplpsrtype registers to define the behavior of replicated packets.
• configure vt_ctl.def_pl to define the default pool. if packets with no pools should be dropped, set the vt_ctl.dis_def_pool field.
• enable replication via vt_ctl.replication_en.
• enable loopback via dtxswc.loopback.
• if needed, enable padding of small packets via rctl.psp.

4.5.11.1.2 mirroring rules

for each mirroring rule to be activated:
a. set the type of traffic to be mirrored in the vmrctl[n] register.
b. set the mirror pool in vmrctl[n].mp.
c. for pool mirroring, set the vmrvm[n] register with the pools to be mirrored.
d. for vlan mirroring, set the vmrvlan[n] register with the indexes from the vlvf registers of the vlans to be mirrored.

4.5.11.1.3 per pool settings

as soon as a pool of queues is associated with a vm, the software should set the following parameters:

1. address filtering:
a. the unicast mac address of the vm, by enabling the pool in the rah/ral registers.
b. if all the mac addresses are used, the unicast hash table (uta) can be used. pools servicing vms whose address is in the hash table should be declared as such by setting vmolr.rope. packets received according to this method did not pass perfect filtering and are indicated as such.
c. enable the pool in all the rah/ral registers representing the multicast mac addresses this vm belongs to.
d. if all the mac addresses are used, the multicast hash table (mta) can be used. pools servicing vms using multicast addresses in the hash table should be declared as such by setting vmolr.rompe. packets received according to this method did not pass perfect filtering and are indicated as such.
e. define whether this vm should get all multicast/broadcast packets in the same vlan via the vmolr.mpe and vmolr.bam fields.
f. enable the pool in each vlvf register representing a vlan this vm belongs to.
g. a vm might be set to receive its own traffic, in case the source and the destination are in the same pool, via the dtxswc.lle field.
h. define whether the pool belongs to the default vlan and should accept untagged packets via the vmolr.aupe field.

2. offloads:
a. define whether the vlan header should be stripped from the packet. crc is always stripped from the packet.
b. set which header split is required via the psrtype register.
c. set whether larger than standard packets are allowed by the vm and what is the largest packet allowed (jumbo packets support) via vmolr.rlpml and vmolr.rle.
d. in rss mode, define if the pool uses rss via the vmolr.rsse bit.

3. queues:
a. enable rx and tx queues as described in section 4.5.9 and section 4.5.10.
b. for each rx queue a drop/no drop flag can be set in srrctl.drop_en or via the qde register, controlling the behavior in cases where no receive buffers are available in the queue to receive packets. the usual behavior is to allow drops in order to avoid head of line blocking, unless a no-drop behavior is needed for some type of traffic (e.g. storage).

4.5.11.1.4 security features

4.5.11.1.4.1 anti spoofing

for each pool, the driver may activate the mac and vlan anti spoof features via the relevant bit in dtxswc.macas and dtxswc.vlanas respectively.

4.5.11.1.4.2 storm control

the driver may set limits to the broadcast or multicast traffic it can receive.
1. it should set how many 64-byte chunks of broadcast and multicast traffic are acceptable per interval via the bsctrh and msctrh registers respectively.
2. it should then set the interval to be used via the sccrl.interval field and which action to take when the broadcast or multicast traffic crosses the programmed threshold via the sccrl.bdipw, sccrl.bdicw, sccrl.mdipw, and sccrl.mdicw fields.
3. the driver may be notified of storm control events through the icr.sce interrupt cause.

4.5.11.1.5 allocation of tx bandwidth to vms

4.5.11.1.5.1 configuring tx bandwidth to vms

the allocation of tx bandwidth to vms feature is enabled or disabled via the programming of the vmbacs and vmbammw registers. when enabled, bandwidth to vms (i.e. to tx queues) is configured by writing into the vmbasel and vmbac registers for each queue. the bandwidth configuration procedure is as follows:
1. allocate non-null rates rvm_i (i = 0..7), in gb/s units, to the vms present in the system, so that:
   rvm_0 + rvm_1 + ... + rvm_7 = 0.5 gb/s
   assume also that for any different i, j: rvm_i / rvm_j < 10 and rvm_j / rvm_i < 10
2. allocate rates rq_i (i = 0..7), in gb/s units, to the enabled queues, so that:
   rq_i = rq_(i+8) = rvm_i / 2
3. compute rate factors rfq_i (i = 0..15) for all the enabled tx queues, so that:
   rfq_i = 1 gb/s / rq_i
4. format the rate factors obtained in the previous step as fixed-point binary numbers, with a 10-bit integer part left of the binary point and a 14-bit fractional part right of it, and for i = 0..15, set rttdqsel.txdq_idx = i and then:
   a. set rttdvmrc.rf_int = integer part of rfq_i
   b. set rttdvmrc.rf_dec = fractional part of rfq_i
5. compute vm_mmw_size for the vm rate-scheduler as follows: vm_mmw_size = 16 x mss, to avoid saturation under full workload. refer to section 4.5.11.1.5.2.
6. set vmbammw.mmw_size = vm_mmw_size

4.5.11.1.5.2 link speed change procedure

whenever the link status or speed is changed, the 82576 operates the vm arbiters in a packet based round robin mode, and disables the vm rate-controllers. software is responsible for re-enabling and re-configuring them according to the new link speed. however, to avoid any race condition between hardware and software, the following procedure must be performed by the driver whenever a link speed/status change interrupt occurs:
1. check that the speed_chg bit in the vmbacs register was asserted by hardware.
2. read the vmba_set bit in the vmbacs register.
3. if the bit is read as 1, it means the vm rate-controllers were not completely disabled by hardware (i.e. a race occurred between hardware and software). software must therefore clear the rc_ena bit in the vmbac register for all the queues, or at least for the queue(s) for which it is still set.
4. clear the speed_chg bit in the vmbacs register.

4.5.11.2 iov initialization

the initialization flow used to enable an iov function can be found in chapter 2 of the pci-express single root i/o virtualization and sharing specification.

4.5.11.2.1 pf driver initialization

the pf driver is responsible for the link setup and for handling all the filtering and offload capabilities for all the vfs as described in section 4.5.11.1.1, and the security features as described in section 4.5.11.1.4. it should also set the bandwidth allocation per transmit queue for each vf as described in section 4.5.10 and section 4.5.11.1.5.

note: the link setup might include an authentication process (802.1x or other) and setup of the macsec channel.
in iov mode, next generation vmdq + rss mode is not available. rss mode might be used, but this assumes that all the vms use the same key, rss hash algorithms and redirection table, which is not currently in the plan of record (por) of any vmm vendor.

after all the common parameters are set, the pf driver should set all the vfmailbox.rstd bits by setting ctrl_ext.pfrstd.

the pf might disable all active vf traffic (via the vfte and vfre registers) until the parameters of a vf are set; see section 4.5.11.1.3. vfs can be enabled using the same registers.
4.5.11.2.1.1 vf specific reset coordination

after the pf driver receives an indication of a vf flr via the vflre register, it should enable the receive and transmit for the vf only once the device is programmed with the right parameters as defined in section 4.5.11.1.3. the receive filtering is enabled using the vfre register and the transmit filtering is enabled via the vfte register.

note: the filtering and offload setup might be based on central it settings or on requests from the vf drivers. the pf driver should assert the vf reset via the vctrl register before configuration of the vf parameters.

4.5.11.2.2 vf driver initialization

upon init, after the pf has indicated that the global init was done via the vfmailbox.rstd bit, the vf driver should communicate with the pf, either via the mailbox or other software mechanisms, to assure that the right parameters of the vf are programmed as described in section 4.5.11.1.3. the mailbox mechanism is described in section 7.10.2.9.1. the pf should also set up the security measures as described in section 4.5.11.1.4. in addition, the pf may also program whether the vf is allowed to control vlan insertion or whether vlan insertion is controlled by the pf via the relevant vmvir register. the pf driver might then send an acknowledge message with the actual setup done according to the vf request and the it policy.

the vf driver should then set up the interrupts and the queues as described in section 4.5.9 and section 4.5.10.

4.5.11.2.3 full reset coordination

a mechanism is provided to synchronize reset procedures between the physical function and the vfs. it is provided specifically for pf software reset but can be used in other reset cases as described below. the procedure is as follows:

one of the following reset cases takes place:
• internal_power_on_reset
• pcie reset (perst# and in-band)
• d3hot --> d0
• flr
• software reset by the pf

the 82576 sets the rsti bits in all the vfmailbox registers. once the reset completes, each vf might read its vfmailbox register to identify a reset in progress.

once the pf has completed configuring the device, it sets the ctrl_ext.pfrstd bit. as a result, the 82576 clears the rsti bits and sets the rstd (reset done) bits in all the vfmailbox registers.
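The rsti/rstd handshake of the full reset coordination can be illustrated with a toy model. This is a hedged sketch only: the bit positions chosen for rsti and rstd, the count of eight vfs, and the names `FakeDevice`, `pf_sets_pfrstd` and `vf_may_start` are assumptions made for illustration and are not defined in this section.

```python
# Bit positions are assumptions; this section does not state them.
RSTI = 1 << 6   # reset in progress indication
RSTD = 1 << 7   # reset done indication


class FakeDevice:
    """Hypothetical model of the per-vf vfmailbox reset indications."""

    def __init__(self, num_vfs=8):          # vf count is an assumption
        self.vfmailbox = [0] * num_vfs

    def reset(self):
        # Any of the listed reset cases: the device sets rsti for every vf.
        self.vfmailbox = [RSTI] * len(self.vfmailbox)

    def pf_sets_pfrstd(self):
        # pf finished configuring: rsti is cleared and rstd set for every vf.
        self.vfmailbox = [(m & ~RSTI) | RSTD for m in self.vfmailbox]


def vf_may_start(device, vf_index):
    """True once the vf sees rstd set and rsti cleared in its mailbox."""
    mailbox = device.vfmailbox[vf_index]
    return bool(mailbox & RSTD) and not (mailbox & RSTI)


dev = FakeDevice()
dev.reset()
blocked = vf_may_start(dev, 3)      # reset still in progress
dev.pf_sets_pfrstd()
ready = vf_may_start(dev, 3)        # pf has signalled reset done
```

In this model a vf driver would keep polling `vf_may_start` and touch nothing except its mailbox until it returns true.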
until a rstd condition is detected, the vfs should access only the vfmailbox register and should not attempt to activate the interrupt mechanism or the transmit and receive process.

4.5.11.2.4 iov disable

after iov is disabled, the pf cannot immediately reuse the resources released by the vf. it should first wait 100 ms, to make sure all the pending requests and completions of the defunct vfs are processed. after that, it should set the iovctl.use vf queues bit. only then may the released queues be reused by the pf.

4.5.11.2.5 vfre/vfte

this mechanism ensures that a vf cannot transmit or receive before the tx and rx paths have been initialized by the pf. it is required for vflr reset and must also be used in case of vf software reset. it is optional for pf software reset as described above.

the vfre register contains a bit per vf. when the bit is cleared, assignment of rx packets for the vf's pool is disabled. when set, assignment of rx packets for the vf's pool is enabled.

the vfte register contains a bit per vf. when the bit is cleared, fetching of data for the vf's pool is disabled. when set, fetching of data for the vf's pool is enabled. fetching of descriptors for the vf pool is maintained, up to the limit of the internal descriptor queues, regardless of the vfte settings.

note: the vfre and vfte registers apply in all device modes (not just iov). the default values for both registers are therefore '1', enabling transmission and reception in non-iov modes.

4.5.12 transmit rate limiting configuration

4.5.12.1 link speed change procedure

whenever the link status or speed is changed, the 82576 disables the rate-schedulers. software is responsible for re-enabling and re-configuring them according to the new link speed. however, to avoid any race condition between hardware and software, the following procedure must be performed by the driver whenever a link speed/status change interrupt occurs:
1. check that the speed_chg bit in the trldcs register was asserted by hardware.
2. read the trl_rs_set bit in the trldcs register.
3. if the bit is read as 1, it means the rate-schedulers were not completely disabled by hardware (i.e. a race occurred between hardware and software). software must therefore clear the rs_ena bit in the trlrc register for all the queues, or at least for the queue(s) for which it is still set.
4. clear the speed_chg bit in the trldcs register.
5. set the appropriate link_speed field in the trldc register.

4.5.12.2 configuration flow

at the initialization stage, the following registers shall be configured:
• tx rate-limiter mmw (trlmmw), typically with mmw_size = 0x014 if 9500-byte jumbo frames are supported over the tc, 0x004 otherwise
• tx rate-limiter control register (trlcr)

the driver updates the rate-limiter parameters of the concerned tx queue, on the fly, as follows:
• tx descriptor plane queue select (trldqsel.txq_idx) with the index of the rate-limited queue
• tx rate-limiter rate config (trlrc), with rs_ena = 1 and the desired maximum rate

4.5.12.3 configuration rules

setting a rate limiter on tx queue n to a target rate requires the following settings:
• select the requested queue by programming the queue index - trldqsel.txq_idx
• program the desired rate as follows:
  • compute the rate_factor, which equals link_speed / target_rate. link_speed can be either 1 gb/s or 100 mb/s. note that the rate_factor is composed of an integer number plus a fraction. the integer part is a 10-bit field and the fraction part is a 14-bit binary fraction.
  • integer (rate_factor) is programmed by the trlrc.rf_int[9:0] field
  • fraction (rate_factor) is programmed by the trlrc.rf_dec[13:0] field. it equals rf_dec[13] * 2^-1 + rf_dec[12] * 2^-2 + ... + rf_dec[0] * 2^-14
• enable the rate scheduler by setting trlrc.rs_ena

numerical example
• target_rate = 24 mb/s; link_speed = 1 gb/s
• rate_factor = 1 / 0.024 = 41.6666... = 101001.10101010101011b
• rf_dec = 10101010101011b; rf_int = 0000101001b
• therefore, set trlrc to 0x800a6aab

4.6 access to shared resources

part of the resources in the 82576 are shared between several software entities, namely the drivers of the two ports and the internal firmware. in order to avoid contentions, a driver that needs to access one of these resources should use the flow described in section 4.6.1 in order to acquire ownership of this resource and use the flow described in section 4.6.2 in order to relinquish ownership of this resource.

the shared resources are:
1. the eeprom
2. both phys
3. csrs accessed by the internal firmware after the initialization process. currently there are no such csrs.
4. the flash.
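Returning to the rate-limiter numerical example of section 4.5.12.3, the 10.14 fixed-point encoding of the rate factor can be checked with a few lines of Python. This is a sketch under stated assumptions: the field placement (rf_dec in bits 13:0, rf_int in bits 23:14, rs_ena in bit 31) is inferred from the worked value 0x800a6aab, round-to-nearest is assumed because it reproduces that value, and `trlrc_value` is a hypothetical helper name.

```python
from fractions import Fraction

RF_DEC_BITS = 14          # fractional bits, per the text
RF_INT_BITS = 10          # integer bits, per the text
RS_ENA = 1 << 31          # position inferred from the example 0x800a6aab


def trlrc_value(link_speed_mbps, target_rate_mbps):
    """Encode link_speed / target_rate as 10.14 fixed point plus rs_ena."""
    rate_factor = Fraction(link_speed_mbps, target_rate_mbps)
    # Scale by 2^14 and round to nearest (assumed rounding rule).
    fixed = round(rate_factor * (1 << RF_DEC_BITS))
    if not 0 < fixed < (1 << (RF_INT_BITS + RF_DEC_BITS)):
        raise ValueError("rate factor does not fit in 10.14 fixed point")
    return RS_ENA | fixed


value = trlrc_value(link_speed_mbps=1000, target_rate_mbps=24)
rf_int = (value >> RF_DEC_BITS) & ((1 << RF_INT_BITS) - 1)
rf_dec = value & ((1 << RF_DEC_BITS) - 1)
```

Using exact fractions avoids floating-point surprises in the last fractional bit; for the 24 mb/s example the fields come out as rf_int = 41 and rf_dec = 0x2aab.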
note: any other software tool that accesses the 82576 register set directly should also follow the flow described below.

4.6.1 acquiring ownership over a shared resource

the following flow should be used to acquire a shared resource:
1. get ownership of the software/software semaphore swsm.smbi (offset 0x5b50 bit 0).
   a. read the swsm register.
   b. if swsm.smbi is read as zero, the semaphore has been acquired.
   c. otherwise, go back to step a.
   this step assures that other software will not access the shared resources register (sw_fw_sync).
2. get ownership of the software/firmware semaphore swsm.swesmbi (offset 0x5b50 bit 1):
   a. set the swsm.swesmbi bit.
   b. read swsm.
   c. if swsm.swesmbi was successfully set, the semaphore was acquired; otherwise, go back to step a.
   this step assures that the internal firmware will not access the shared resources register (sw_fw_sync).
3. software reads the software-firmware synchronization register (sw_fw_sync) and checks both bits in the pair of bits that control the resource it wishes to own.
   a. if both bits are cleared (neither firmware nor other software owns the resource), software sets the software bit in the pair of bits that control the resource it wishes to own.
   b. if one of the bits is set (firmware or other software owns the resource), software tries again later.
4. release ownership of the software/software semaphore and the software/firmware semaphore by clearing the swsm.smbi and swsm.swesmbi bits.
5. at this stage, the shared resource is owned by the driver, which may access it. the swsm and sw_fw_sync registers can now be used to take ownership of another shared resource.
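The acquire flow above can be sketched against an in-memory register model. The smbi and swesmbi bit positions come from the text; everything else is an assumption made for illustration: the sw_fw_sync bit pair used for a phy, the read side effect modelled in `read_swsm` (hardware leaving smbi set after a read), and the names `FakeRegs`, `take_semaphores` and `acquire`.

```python
SMBI = 1 << 0       # swsm bit 0, as stated in the text
SWESMBI = 1 << 1    # swsm bit 1, as stated in the text

# Hypothetical sw_fw_sync layout: one software bit and one firmware bit
# per resource. The actual bit assignment is not given in this section.
PHY0_SW = 1 << 1
PHY0_FW = 1 << 17


class FakeRegs:
    def __init__(self, swsm=0, sw_fw_sync=0):
        self.swsm = swsm
        self.sw_fw_sync = sw_fw_sync

    def read_swsm(self):
        # Assumed hardware behavior: reading swsm returns the current value
        # and leaves smbi set, so only one reader ever sees it as zero.
        value = self.swsm
        self.swsm |= SMBI
        return value


def take_semaphores(regs, attempts=100):
    for _ in range(attempts):                 # step 1: wait for smbi == 0
        if not regs.read_swsm() & SMBI:
            break
    else:
        return False
    for _ in range(attempts):                 # step 2: set swesmbi, read back
        regs.swsm |= SWESMBI
        if regs.read_swsm() & SWESMBI:
            return True
    regs.swsm &= ~SMBI                        # give smbi back on failure
    return False


def acquire(regs, sw_bit, fw_bit):
    """Return True if the resource is now owned, False to retry later."""
    if not take_semaphores(regs):
        return False
    owned = not regs.sw_fw_sync & (sw_bit | fw_bit)   # step 3: both clear?
    if owned:
        regs.sw_fw_sync |= sw_bit                     # step 3a: claim it
    regs.swsm &= ~(SMBI | SWESMBI)                    # step 4: drop semaphores
    return owned


regs = FakeRegs()
first = acquire(regs, PHY0_SW, PHY0_FW)    # resource free: claimed
second = acquire(regs, PHY0_SW, PHY0_FW)   # already owned by software
```

The bounded retry loops are a design choice of the sketch; the text itself simply loops back to step a, and a real driver would normally add a delay between attempts.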
4.6.2 releasing ownership over a shared resource

the following flow should be used to release a shared resource:
1. get ownership of the software/software semaphore swsm.smbi (offset 0x5b50 bit 0).
   a. read the swsm register.
   b. if swsm.smbi is read as zero, the semaphore has been acquired.
   c. otherwise, go back to step a.
   this step assures that other software will not access the shared resources register (sw_fw_sync).
2. get ownership of the software/firmware semaphore swsm.swesmbi (offset 0x5b50 bit 1):
   a. set the swsm.swesmbi bit.
   b. read swsm.
   c. if swsm.swesmbi was successfully set, the semaphore was acquired; otherwise, go back to step a.
   this step assures that the internal firmware will not access the shared resources register (sw_fw_sync).
3. clear the bit in sw_fw_sync that controls the software ownership of the resource to indicate that this resource is free.
4. release ownership of the software/software semaphore and the software/firmware semaphore by clearing the swsm.smbi and swsm.swesmbi bits.
5. at this stage, the shared resource is released by the driver, which may no longer access it. the swsm and sw_fw_sync registers can now be used to take ownership of another shared resource.
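The release flow mirrors the acquire flow, with step 3 clearing the software ownership bit instead of setting it. The sketch below is illustrative only: the sw_fw_sync bit values, the read side effect on smbi, and the names `FakeRegs` and `release` are assumptions, not definitions from this section.

```python
SMBI = 1 << 0       # swsm bit 0, as stated in the text
SWESMBI = 1 << 1    # swsm bit 1, as stated in the text


class FakeRegs:
    """Hypothetical register model; sw_fw_sync bit layout is assumed."""

    def __init__(self, sw_fw_sync):
        self.swsm = 0
        self.sw_fw_sync = sw_fw_sync

    def read_swsm(self):
        # Assumed hardware behavior: a read leaves smbi set afterwards.
        value = self.swsm
        self.swsm |= SMBI
        return value


def release(regs, sw_bit, attempts=100):
    """Return True once the software ownership bit has been cleared."""
    for _ in range(attempts):                 # step 1: wait for smbi == 0
        if not regs.read_swsm() & SMBI:
            break
    else:
        return False
    for _ in range(attempts):                 # step 2: set swesmbi, read back
        regs.swsm |= SWESMBI
        if regs.read_swsm() & SWESMBI:
            break
    else:
        regs.swsm &= ~SMBI                    # give smbi back on failure
        return False
    regs.sw_fw_sync &= ~sw_bit                # step 3: mark the resource free
    regs.swsm &= ~(SMBI | SWESMBI)            # step 4: drop both semaphores
    return True


OWNED_BY_SW = 1 << 1                          # hypothetical software bit
OTHER_FW_BIT = 1 << 16                        # unrelated firmware-owned bit
regs = FakeRegs(sw_fw_sync=OWNED_BY_SW | OTHER_FW_BIT)
released = release(regs, OWNED_BY_SW)
```

Only the caller's own software bit is cleared; ownership bits held by firmware or by other software are left untouched.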
5.0 power management

this section describes how power management is implemented in the 82576. the 82576 supports the advanced configuration and power interface (acpi) specification as well as advanced power management (apm).

note: power management can be disabled via the power management bit in the initialization control word 1 eeprom word (see section 6.2.2).

5.1 general power state information

5.1.1 pci device power states

the pcie specification defines function power states (d-states) that enable the platform to establish and control power states for the 82576, ranging from fully on to fully off (drawing no power), with various in-between power-saving levels, annotated as d0-d3. similarly, pcie defines a series of link power states (l-states) that work specifically within the link layer between the 82576 and its upstream pcie port (typically in the host chipset).

since the 82576 is a multi-port device, each of its pci functions may be in a different state at any given moment. the device power state is defined by the most active function. for example, if function 0 is in d0 state and all other functions are in d3 state, the device state is d0. the link state follows the device state.

for a given device d-state, only certain l-states are possible, as follows:
• d0 (fully on): the 82576 is completely active and responsive during this d-state. the link can be in either l0 or a low-latency idle state referred to as l0s. minimizing l0s exit latency is paramount for enabling frequent entry into l0s while facilitating performance needs via a fast exit. a deeper link power state, l1, is supported as well.
• d1 and d2: these modes are not supported by the 82576.
• d3 (off): two sub-states of d3 are supported:
  • d3hot, where primary power is maintained.
  • d3cold, where primary power is removed.

link states are mapped into device states as follows:
• d3hot maps to l1 to support clock removal.
• d3cold maps to l2 if auxiliary power is supported on the 82576 with wake-capable logic, or to l3 if no power is delivered to the 82576.

a sideband pe_wake_n mechanism is supported to interface wake-enabled logic on mobile platforms during the l2 state.
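the "most active function" rule and the d-state to l-state mapping of this section can be written down compactly. the following is a minimal python sketch; the dstate enum, its numeric ordering, and both function names are hypothetical and serve only to illustrate the rules stated above.

```python
from enum import IntEnum


class DState(IntEnum):
    # ordered so that a larger value means a more active state;
    # d1 and d2 are omitted because the 82576 does not support them
    D3COLD = 0
    D3HOT = 1
    D0 = 2


def device_state(function_states):
    """the device power state is the state of the most active function."""
    return max(function_states)


def link_states(dstate, aux_power_with_wake=False):
    """l-states that section 5.1.1 associates with a device d-state."""
    if dstate is DState.D0:
        return ("l0", "l0s", "l1")
    if dstate is DState.D3HOT:
        return ("l1",)
    # d3cold: l2 needs auxiliary power and wake-capable logic, otherwise l3
    return ("l2",) if aux_power_with_wake else ("l3",)
```

for example, device_state([DState.D0, DState.D3HOT]) evaluates to DState.D0, which mirrors the function 0 example in the text.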
5.1.2 pcie link power states

configuring an 82576 into d-states automatically causes the pcie links to transition to the appropriate l-states.
• l2/l3 ready: this link state prepares the pcie link for the removal of power and clock. the 82576 is in the d3hot state and is preparing to enter d3cold. the power-saving opportunities for this state include, but are not limited to, clock gating of all pcie architecture logic, shutdown of the pll, and shutdown of all transceiver circuitry.
• l2: this link state is intended to comprehend d3cold with auxiliary power support. note that sideband wake# signaling is recommended to cause wake-capable devices to exit this state. the power-saving opportunities for this state include, but are not limited to, shutdown of all transceiver circuitry except detection circuitry to support exit, clock gating of all pcie logic, and shutdown of the pll as well as appropriate platform voltage and clock generators.
• l3 (link off): power and clock are removed in this link state, and there is no auxiliary power available. to bring the 82576 and its link back up, the platform must go through a boot sequence where power, clock, and reset are reapplied appropriately.

5.2 82576 power states

the 82576 supports the d0 and d3 architectural power states as described earlier. internally, the 82576 supports the following power states:
• d0u (d0 un-initialized) - an architectural sub-state of d0
• d0a (d0 active) - an architectural sub-state of d0
• d3 - architecture state d3hot
• dr - internal state that contains the architecture d3cold state. dr state is entered when pe_rst_n is asserted or a pcie in-band reset is received.

figure 5-1 shows the power states and transitions between them.
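the transitions that sections 5.2.1 through 5.2.4 describe in prose can also be summarised as a lookup table. the following is a minimal python sketch; the state strings follow the names used in this section, while the event names and the next_state function are hypothetical labels chosen for illustration. only transitions stated in the text are listed.

```python
# (current state, event) -> next state
TRANSITIONS = {
    ("dr", "pe_rst_n_deasserted"): "d0u",
    ("d0u", "memory_or_io_access_enabled"): "d0a",
    ("d0a", "pmcsr_power_state_11b"): "d3",
    ("d3", "pmcsr_power_state_00b"): "d0u",
}


def next_state(state, event):
    """return the power state that follows `event`, per sections 5.2.1-5.2.4."""
    # asserting pe_rst_n (or an in-band reset) enters dr from any state
    if event == "pe_rst_n_asserted":
        return "dr"
    # events with no listed transition leave the state unchanged in this sketch
    return TRANSITIONS.get((state, event), state)
```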
[figure 5-1. power management state diagram]

5.2.1 d0 uninitialized state (d0u)

the d0u state is an architectural low-power state. when entering d0u, the 82576:
• asserts a reset to the phy while the eeprom is being read.
• disables wake up. however, if the apm mode bit in the eeprom's initialization control word 2 is set, then apm wake up is enabled.

5.2.1.1 entry into d0u state

d0u is reached from either the dr state (on de-assertion of pe_rst_n) or the d3hot state (by configuration software writing a value of 00b to the power state field of the pci pm registers).

5.2.1.2 exit from d0u state

de-asserting pe_rst_n means that the entire state of the 82576 is cleared, other than sticky bits. state is loaded from the eeprom, followed by establishment of the pcie link. once this is done, configuration software can access the 82576.
on a transition from d3 to d0u state, the 82576 pci configuration space is not reset. however, the 82576 requires that software perform a full re-initialization of the function, including its pci configuration space.

5.2.2 d0active state

once memory space is enabled, the 82576 enters a d0 active state. it can transmit and receive packets if properly configured by the software device driver. the phy is enabled or re-enabled by the software device driver to operate/auto-negotiate to full line speed/power if not already operating at full capability. any apm wake up previously active remains active. the software device driver can deactivate apm wake up by writing to the wake up control (wuc) register or activate other wake-up filters by writing to the wake up filter control (wufc) register.

5.2.2.1 entry to d0a state

d0a is entered from the d0u state by writing a 1b to the memory access enable or the i/o access enable bit of the pci command register. the dma, mac, and phy of the appropriate lan function are also enabled.

5.2.3 d3 state (pci-pm d3hot)

the 82576 transitions to d3 when the system writes a 11b to the power state field of the power management control/status register (pmcsr). any wake-up filter settings that were enabled before entering this state are maintained. upon completion of, or during, the transition to d3 state, the 82576 clears the memory access enable and i/o access enable bits of the pci command register, which disables memory access decode. while in d3, the 82576 does not generate master cycles.

configuration and message requests are the only tlps accepted by a function in the d3hot state. all other received requests must be handled as unsupported requests, and all received completions are handled as unexpected completions.
if an error caused by a received tlp (such as an unsupported request) is detected while in d3hot, and reporting is enabled, the link must be returned to l0 if it is not already in l0 and an error message must be sent. see section 5.3.1.4.1 in the pcie base specification.

5.2.3.1 entry to d3 state

transition to d3 state is through a configuration write to the power state field of the pci-pm registers.

prior to transition from d0 to the d3 state, the software device driver disables scheduling of further tasks to the 82576: it masks all interrupts, stops writing to the transmit descriptor tail (tdt) and receive descriptor tail (rdt) registers, and runs the master disable algorithm as defined in section 5.2.3.3.

if wake up capability is needed, the system should enable wake capability by setting the pme_en bit in the power management control/status register (pmcsr) to 1b. after wake capability has been enabled, the software device driver should set up the appropriate wake up registers prior to the d3 transition.

note: if operation during d3cold is required, even when wake capability is not required (for example, for manageability operation), the system should also set the auxiliary (aux) power pm enable bit in the pcie device control register.
as a response to being programmed into d3 state, the 82576 transitions its pcie link into the l1 link state. as part of the transition into l1 state, the 82576 suspends scheduling of new tlps and waits for the completion of all previous tlps it has sent. the 82576 clears the memory access enable and i/o access enable bits of the pci command register, which disables memory access decode. any receive packets that have not been transferred into system memory are kept in the 82576 (and discarded later on d3 exit). any transmit packets that have not been sent can still be transmitted (assuming the ethernet link is up).

in order to reduce power consumption, if the link is still needed for manageability or wake-up functionality, the phy auto-negotiates to a lower link speed on d3 entry (see section 3.5.7.6.4).

5.2.3.2 exit from d3 state

a d3 state is followed by either a d0u state (in preparation for a d0a state) or by a transition to dr state (pci-pm d3cold state). to transition back to d0u, the system writes a 00b to the power state field of the power management control/status register (pmcsr). transition to dr state is through pe_rst_n assertion.

the 82576 always sets the no_soft_reset bit in the pcie power management control/status register (pmcsr) to 0b to indicate that the 82576 performs an internal reset on transition from d3hot to d0. configuration context is lost when performing the soft reset. after transition from the d3hot to the d0 state, a full re-initialization sequence is needed to return the 82576 to d0 initialized.

5.2.3.3 master disable via ctrl register

system software can disable master accesses on the pcie link by either clearing the pci bus master bit or by bringing the function into a d3 state. from that time on, the 82576 must not issue master accesses for this function.
due to the full-duplex nature of pcie, and the pipelined design in the 82576, it might happen that multiple requests from several functions are pending when the master disable request arrives. the protocol described in this section ensures that a function does not issue master requests to the pcie link after its master enable bit is cleared (or after entry to d3 state).

two configuration bits are provided for the handshake between the 82576 function and its software device driver:
• gio master disable bit in the device control (ctrl) register - when the gio master disable bit is set, the 82576 blocks new master requests by this function. the 82576 then proceeds to issue any pending requests by this function. this bit is cleared on master reset (internal_power_on_reset to software reset) to enable master accesses.
• gio master enable status bit in the device status register - cleared by the 82576 when the gio master disable bit is set and no master requests are pending by the relevant function; set otherwise. when cleared, it indicates that no master requests are issued by this function as long as the gio master disable bit is set.

the following activities must end before the 82576 clears the gio master enable status bit:
• master requests by the transmit and receive engines.
• all pending completions to the 82576 are received.

note: the software device driver sets the gio master disable bit when notified of a pending master disable (or d3 entry). the 82576 then blocks new requests and proceeds to issue any pending requests by this function. the software device driver then polls the gio master enable status bit. once the bit is cleared, it is guaranteed that no requests are pending from this function. the software device driver might time out if the gio master enable status bit is not cleared within a given time.
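the driver side of this handshake is a set-then-poll loop with a timeout. the following is a minimal python sketch of that loop; the read_reg and write_reg callables are hypothetical accessors supplied by the caller, the bit positions are assumptions chosen for illustration and should be checked against the ctrl and status register definitions, and the timeout values are arbitrary.

```python
import time

CTRL_GIO_MASTER_DISABLE = 1 << 2      # assumption: bit position for illustration
STATUS_GIO_MASTER_ENABLE = 1 << 19    # assumption: bit position for illustration


def disable_pcie_master(read_reg, write_reg, timeout_s=0.1, poll_s=0.001):
    """set gio master disable, then poll gio master enable status.

    returns True once the status bit reads as cleared (no requests pending),
    or False if it is still set when the timeout expires.
    """
    # block new master requests from this function
    write_reg("ctrl", read_reg("ctrl") | CTRL_GIO_MASTER_DISABLE)

    deadline = time.monotonic() + timeout_s
    while True:
        if not read_reg("status") & STATUS_GIO_MASTER_ENABLE:
            return True
        if time.monotonic() >= deadline:
            return False   # the driver may give up, as the note above allows
        time.sleep(poll_s)
```

what a driver does after a timeout is outside the scope of this sketch; the text only states that the driver might time out.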
the gio master disable bit must be cleared to enable a master request to the pcie link. this can be done either through reset or by the software device driver.

5.2.4 dr state (d3cold)

transition to dr state is initiated on several occasions:
• on system power up - dr state begins with the assertion of the internal power detection circuit (pe_rst_n) and ends with de-assertion of pe_rst_n.
• on transition from a d0a state - during operation the system might assert pe_rst_n at any time. in an acpi system, a system transition to the g2/s5 state causes a transition from d0a to dr state.
• on transition from a d3 state - the system transitions the 82576 into the dr state by asserting pcie pe_rst_n.

any wake-up filter settings that were enabled before entering this reset state are maintained.

the system might maintain pe_rst_n asserted for an arbitrary time. the de-assertion (rising edge) of pe_rst_n causes a transition to d0u state.

while in dr state, the 82576 might enter one of several modes with different levels of functionality and power consumption. the lower-power modes are achieved when the 82576 is not required to maintain any functionality (see section 5.2.4.1).

note: if the 82576 is configured to provide a 50 mhz nc-si clock (via the nc-si output clock eeprom bit), then the nc-si clock must be provided in dr state as well.

5.2.4.1 dr disable mode

the 82576 enters dr disable mode on transition to d3cold state when it does not need to maintain any functionality. the conditions to enter this mode are:
• the 82576 (all pci functions) is in dr state.
• apm wol is inactive for both lan functions.
• pass-through manageability is disabled.
• acpi pme is disabled for all pci functions.
• the 82576 power down en eeprom bit (word 0x1e, bit 15) is set (default hardware value is disabled).
• the phy power down enable eeprom bit (word 0xf, bit 6) is set.

entering dr disable mode is usually done by asserting pcie pe_rst_n. it might also be possible to enter dr disable mode by reading the eeprom while already in dr state. the usage model for this latter case is on system power up, assuming that manageability and wake up are not required. once the 82576 enters dr state on power-up, the eeprom is read. if the eeprom contents determine that the conditions to enter dr disable mode are met, the 82576 then enters this mode (assuming that pcie pe_rst_n is still asserted).

note: the 82576 exits dr disable mode when dr state is exited (see figure 5-1 for conditions to exit dr state).
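the six entry conditions for dr disable mode are a plain conjunction, which makes them easy to check in a bring-up script or a test bench. the following is a minimal python sketch; the drconditions class and its field names are hypothetical, and each field is assumed to have been derived elsewhere from the device state and eeprom contents.

```python
from dataclasses import dataclass


@dataclass
class DrConditions:
    all_functions_in_dr: bool        # every pci function is in dr state
    apm_wol_active: bool             # apm wol active on either lan function
    pass_through_manageability: bool
    acpi_pme_enabled: bool           # acpi pme enabled on any pci function
    power_down_en: bool              # eeprom word 0x1e, bit 15
    phy_power_down_enable: bool      # eeprom word 0xf, bit 6


def enters_dr_disable_mode(c: DrConditions) -> bool:
    """True only when every condition listed in section 5.2.4.1 holds."""
    return (
        c.all_functions_in_dr
        and not c.apm_wol_active
        and not c.pass_through_manageability
        and not c.acpi_pme_enabled
        and c.power_down_en
        and c.phy_power_down_enable
    )
```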
5.2.4.2 entry to dr state

dr entry on platform power-up begins with the assertion of the internal power detection circuit (pe_rst_n). the eeprom is read and determines the 82576 configuration. if the apm enable bit in the eeprom's initialization control word 3 is set, then apm wake up is enabled. phy and mac states are determined by the state of manageability and apm wake. to reduce power consumption, if manageability or apm wake is enabled, the phy auto-negotiates to a lower link speed on dr entry (see section 3.5.7.6.4). the pcie link is not enabled in dr state following system power up (since pe_rst_n is asserted).

entering dr state from d0a state is done by asserting pe_rst_n. an acpi transition to the g2/s5 state is reflected in an 82576 transition from d0a to dr state. the transition can be orderly (for example, the user selected the shut down option), in which case the software device driver might have a chance to intervene. or, it might be an emergency transition (such as power button override), in which case the software device driver is not notified.

to reduce power consumption, if any of manageability, apm wake or pci-pm pme [1] is enabled, the phy auto-negotiates to a lower link speed on d0a to dr transition (see section 3.5.7.6.4).

transition from d3(hot) state to dr state is done by asserting pe_rst_n. prior to that, the system initiates a transition of the pcie link from l1 state to either the l2 or l3 state (assuming all functions were already in d3 state). the link enters l2 state if pci-pm pme is enabled.

[1] acpi 2.0 specifies that ospm will not disable wake events before setting the slp_en bit when entering the s5 sleeping state. this provides support for remote management initiatives by enabling remote power on (rpo) capability. this is a change from acpi 1.0 behavior.

5.2.4.3 auxiliary power usage

the eeprom d3cold_wakeup_adven bit and the aux_pwr strapping pin determine when d3cold pme is supported:
• d3cold_wakeup_adven denotes that pme wake should be supported.
• the aux_pwr strapping pin indicates that auxiliary power is provided.

d3cold pme is supported as follows:
• if d3cold_wakeup_adven is set to '1' and the aux_pwr strapping is set to '1', then d3cold pme is supported.
• otherwise, d3cold pme is not supported.

the amount of power required for the function (including the entire nic) is advertised in the power management data register, which is loaded from the eeprom.

if d3cold is supported, the pme_en and pme_status bits of the power management control/status register (pmcsr), as well as their shadow bits in the wake up control (wuc) register, are reset only by the power up reset (detection of power rising).

5.2.5 link disconnect

in any of the d0u, d0a, d3, or dr power states, the 82576 enters a link-disconnect state if it detects a link-disconnect condition on the ethernet link. note that the link-disconnect state is invisible to software (other than the link energy detect bit state). in particular, while in d0 state, software might be able to access any of the 82576 registers as in a link-connect state.
5.2.6 device power-down state

the 82576 enters a global power-down state if all of the following conditions are met:
• the 82576 power down enable eeprom bit (word 0x1e, bit 15) was set (default hardware value is disabled).
• the 82576 is in dr state.
• the link connections of both ports (phy or serdes) are in power down mode.

the 82576 also enters a power-down state when the dev_off_n pin is active and the relevant eeprom bits were configured as previously described (see section 4.4 for more details on dev_off_n functionality).

5.3 power limits by certain form factors

the 82576 exceeds the allocated auxiliary power in some configurations (such as both ports running at 1000 mb/s speed). the 82576 must therefore be configured to meet requirements. to do so, the 82576 implements the following bits to disable operation in certain cases:
1. the disable_1000 phy register bit disables 1000 mb/s operation under all conditions.
2. the disable 1000 in non-d0a phy csr bit disables 1000 mb/s operation in non-d0a states [1]. if disable 1000 in non-d0a is set, and the 82576 is at 1000 mb/s speed on entry to a non-d0a state, then the 82576 removes advertisement for 1000 mb/s and auto-negotiates.

note that the 82576 restarts link auto-negotiation each time it transitions from a state where 1000 mb/s or 100 mb/s speed is enabled to a state where 1000 mb/s or 100 mb/s speed is disabled, or vice versa. for example, if disable 1000 in non-d0a is set but disable_1000 is cleared, the 82576 restarts link auto-negotiation on transition from d0 state to d3 or dr states.

[1] the restriction is defined for all non-d0a states to have compatible behavior with previous products.

5.4 interconnects power management

this section describes the power reduction techniques employed by the 82576 main interconnects.

5.4.1 pcie link power management

the pcie link state follows the power management state of the 82576. since the 82576 incorporates multiple pci functions, its power management state is defined as the power management state of the most awake function (see figure 5-2):
• if any function is in d0 state (either d0a or d0u), the pcie link assumes the 82576 is in d0 state.
• otherwise, if the functions are in d3 state, the pcie link assumes the 82576 is in d3 state.
• otherwise, the 82576 is in dr state (pe_rst_n is asserted to all functions).

the 82576 supports all pcie power management link states:
• the l0 state is used in d0u and d0a states.
• the l0s state is used in d0a and d0u states each time link conditions apply.
• the l1 state is also used in d0a and d0u states when idle conditions apply for a longer period of time. the l1 state is also used in the d3 state.
• the l2 state is used in the dr state following a transition from a d3 state if pci-pm pme is enabled.
• the l3 state is used in the dr state following power up, on transition from d0a, and if pme is not enabled in other dr transitions.

[figure 5-2. link power management state diagram]

the 82576's support for active state link power management is reported via the pcie active state link pm support register and is loaded from the eeprom.

while in l0 state, the 82576 transitions the transmit lane(s) into l0s state once the idle conditions are met for a period of time.

l0s configuration fields are:
• l0s enable - the default value of the active state link pm control field in the pcie link control register is set to 00b (both l0s and l1 disabled). system software might later write a different value into the pcie link control register. the default value is loaded on any reset of the pci configuration registers.
• the l0s_entry_lat bit in the pcie control register (gcr) determines l0s entry latency. when set to 0b, l0s entry latency is the same as the l0s exit latency of the device at the other end of the link. when set to 1b, l0s entry latency is 1/4 of the l0s exit latency of the device at the other end of the link. the default value is 0b (entry latency is the same as the l0s exit latency of the device at the other end of the link).
• l0s exit latency (as published in the l0s exit latency field of the link capabilities register) is loaded from eeprom. separate values are loaded when the 82576 shares the same reference pcie clock with its partner across the link, and when the 82576 uses a different reference clock than its partner across the link. the 82576 reports whether it uses the slot clock configuration through the pcie slot clock configuration bit, loaded from the slot_clock_cfg eeprom bit.
• l0s acceptable latency (as published in the endpoint l0s acceptable latency field of the device capabilities register) is loaded from eeprom.

l1 configuration fields are:
• l1 entry latency - the 82576 enters the l1 state after it has been in the l0s state (in both directions) for a period of time determined by the latency_to_enter_l1 csr register. the initial value is loaded from the latency_to_enter_l1 eeprom field.
• l1 exit latency (as published in the l1 exit latency field of the link capabilities register) is loaded from the l1_act_ext_latency field in the eeprom.
• l1 acceptable latency (as published in the endpoint l1 acceptable latency field of the device capabilities register) is loaded from eeprom.

5.4.2 nc-si clock control

the 82576 can be configured to provide a 50 mhz output clock to its nc-si interface and other platform devices. when enabled (through the nc-si output clock eeprom bit), the nc-si clock is provided in all power states without exception.

5.4.3 phy power-management

the phy power management features are described in section 3.5.7.6.

5.4.4 serdes/sgmii power management

each 82576 serdes enters a power-down state when none of its clients is enabled and therefore has no need to maintain a link.
this can happen in one of the following cases. note that serdes power-down must be enabled through the eeprom serdes low power enable bit.
1. d3/dr state: each serdes enters a low-power state if the following conditions are met:
   a. the lan function associated with this serdes is in a non-d0 state.
   b. apm wol is inactive.
   c. pass-through manageability is disabled.
   d. acpi pme is disabled.
2. phy mode: each serdes is disabled when its lan function is configured to phy mode.
3. lan disable: each serdes can be disabled if its lan function's lan disable input indicates that the relevant function should be disabled. since the serdes is shared between the lan function and manageability, it might not be desired to power down the serdes in lan disable. the phy_in_lan_disable eeprom bit determines whether the serdes is powered down when the lan disable pin is asserted. the default is not to power down.
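the three serdes power-down cases can be folded into one predicate. the following is a minimal python sketch; the function and parameter names are hypothetical, and it assumes that the serdes low power enable bit gates all three cases, which is how the note above reads.

```python
def serdes_powered_down(*, low_power_enable, in_d0, apm_wol,
                        manageability, acpi_pme, phy_mode,
                        lan_disabled, phy_in_lan_disable):
    """True if one of the three cases of section 5.4.4 applies."""
    if not low_power_enable:
        # assumption: without the eeprom enable bit no power-down happens
        return False
    # case 1: non-d0 state with no wake or manageability client
    idle_low_power = not in_d0 and not (apm_wol or manageability or acpi_pme)
    # case 2: the lan function uses the phy, not the serdes
    # case 3: lan disable pin asserted and the eeprom bit allows power-down
    return idle_low_power or phy_mode or (lan_disabled and phy_in_lan_disable)
```

all arguments are keyword-only so that a call site spells out which condition each flag stands for.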
5.5 timing of power-state transitions

the following sections give detailed timing for the state transitions. in the diagrams, the dotted connecting lines represent the 82576 requirements, while the solid connecting lines represent the 82576 guarantees. the timing diagrams are not to scale. the clock edges are shown to indicate running clocks only and are not used to indicate the actual number of cycles for any operation.

5.5.1 power up (off to dup to d0u to d0a)

[figure 5-3. power up (off to dup to d0u to d0a)]

table 5-1. power up (off to dup to d0u to d0a)

note  description
1     xosc is stable t_xog after power is stable.
2     internal_power_on_reset is asserted after all power supplies are good and t_ppg after xosc is stable.
3     an eeprom read starts on the rising edge of internal_power_on_reset.
4     after reading the eeprom, phy reset is de-asserted.
5     apm wake-up mode can be enabled based on what is read from the eeprom.
6     the pcie reference clock is valid t_pe_rst-clk before de-asserting pe_rst_n (according to pcie specification).
7     pe_rst_n is de-asserted t_pvpgl after power is stable (according to pcie specification).
8     the internal pcie clock is valid and stable t_ppg-clkint from pe_rst_n de-assertion.
9     the pcie internal pwrgd signal is asserted t_clkpr after the external pe_rst_n signal.
10    asserting internal pcie pwrgd causes the eeprom to be re-read, asserts phy reset, and disables wake up.
11    after reading the eeprom, phy reset is de-asserted.
12    link training starts after t_pgtrn from pe_rst_n de-assertion.
13    a first pcie configuration access might arrive after t_pgcfg from pe_rst_n de-assertion.
14    a first pci configuration response can be sent after t_pgres from pe_rst_n de-assertion.
15    writing a 1b to the memory access enable bit in the pci command register transitions the 82576 from d0u to d0 state.

5.5.2 transition from d0a to d3 and back without pe_rst_n

[figure 5-4. transition from d0a to d3 and back without pe_rst_n]

table 5-2. transition from d0a to d3 and back without pe_rst_n

note  description
1     writing 11b to the power state field of the power management control/status register (pmcsr) transitions the 82576 to d3.
2     the system can keep the 82576 in d3 state for an arbitrary amount of time.
3     to exit d3 state, the system writes 00b to the power state field of the pmcsr.
4     apm wake-up or smbus mode might be enabled based on what is read in the eeprom.
5     after reading the eeprom, reset to the phy is de-asserted. the phy operates at reduced speed if apm wake up or smbus is enabled; otherwise it is powered down.
6     the system can delay an arbitrary time before enabling memory access.
7     writing a 1b to the memory access enable bit or to the i/o access enable bit in the pci command register transitions the 82576 from d0u to d0 state and returns the phy to full-power/speed operation.

5.5.3 transition from d0a to d3 and back with pe_rst_n

[figure 5-5. transition from d0a to d3 and back with pe_rst_n]
table 5-3. transition from d0a to d3 and back with pe_rst_n

note  description
1     writing 11b to the power state field of the pmcsr transitions the 82576 to d3. the pcie link transitions to l1 state.
2     the system can delay an arbitrary amount of time between setting d3 mode and transitioning the link to an l2 or l3 state.
3     following link transition, pe_rst_n is asserted.
4     the system must assert pe_rst_n before stopping the pcie reference clock. it must also wait t_l2clk after link transition to l2/l3 before stopping the reference clock.
5     on assertion of pe_rst_n, the 82576 transitions to dr state.
6     the system starts the pcie reference clock t_pe_rst-clk before de-asserting pe_rst_n.
7     the internal pcie clock is valid and stable t_ppg-clkint from pe_rst_n de-assertion.
8     the pcie internal pwrgd signal is asserted t_clkpr after the external pe_rst_n signal.
9     asserting internal pcie pwrgd causes the eeprom to be re-read, asserts phy reset, and disables wake up.
10    apm wake-up mode might be enabled based on what is read from the eeprom.
11    after reading the eeprom, phy reset is de-asserted.
12    link training starts after t_pgtrn from pe_rst_n de-assertion.
13    a first pcie configuration access might arrive after t_pgcfg from pe_rst_n de-assertion.
14    a first pci configuration response can be sent after t_pgres from pe_rst_n de-assertion.
15    writing a 1b to the memory access enable bit in the pci command register transitions the 82576 from d0u to d0 state.
5.5.4 transition from d0a to dr and back without transition to d3

[figure 5-6. transition from d0a to dr and back without transition to d3]

table 5-4. transition from d0a to dr and back without transition to d3

note  description
1     the system must assert pe_rst_n before stopping the pcie reference clock. it must also wait t_l2clk after link transition to l2/l3 before stopping the reference clock.
2     on assertion of pe_rst_n, the 82576 transitions to dr state and the pcie link transitions to electrical idle.
3     the system starts the pcie reference clock t_pe_rst-clk before de-asserting pe_rst_n.
4     the internal pcie clock is valid and stable t_ppg-clkint from pe_rst_n de-assertion.
5     the pcie internal pwrgd signal is asserted t_clkpr after the external pe_rst_n signal.
6     asserting internal pcie pwrgd causes the eeprom to be re-read, asserts phy reset, and disables wake up.
7     apm wake-up mode might be enabled based on what is read from the eeprom.
8     after reading the eeprom, phy reset is de-asserted.
9     link training starts after t_pgtrn from pe_rst_n de-assertion.
10    a first pcie configuration access might arrive after t_pgcfg from pe_rst_n de-assertion.
11    a first pci configuration response can be sent after t_pgres from pe_rst_n de-assertion.
12    writing a 1b to the memory access enable bit in the pci command register transitions the 82576 from d0u to d0 state.

5.6 wake up

the 82576 supports two modes of wake-up management:
1. advanced power management (apm) wake up
2. acpi/pcie defined wake up

the usual model is to activate one mode at a time but not both modes together. if both modes are activated, the 82576 might wake up the system on unexpected events. for example, if apm is enabled together with pcie pme, a magic packet might wake up the system even if apmpme is disabled. alternatively, if apm is enabled together with some pcie filters, packets matching these filters might wake up the system even if pcie pme is disabled.

5.6.1 advanced power management wake up

advanced power management wake up, or apm wakeup (also known as wake on lan), is a feature that existed in earlier 10/100 mb/s nics. this functionality was designed to receive a broadcast or unicast packet with an explicit data pattern, and then assert a subsequent signal to wake up the system. this was accomplished by using a special signal that ran across a cable to a defined connector on the motherboard. the nic would assert the signal for approximately 50 ms to signal a wake up. the 82576 now uses (if configured) an in-band pm_pme message for this functionality.

on power up, the 82576 reads the apm enable bits from the eeprom initialization control word 3 into the apm enable (apme) bits of the wake up control (wuc) register. these bits control enabling of apm wake up.

when apm wake up is enabled, the 82576 checks all incoming packets for magic packets. see section 5.6.3.1.4 for a definition of magic packets.

once the 82576 receives a matching magic packet, and if the assert pme on apm wakeup (apmpme) bit is set in the wake up control (wuc) register, it:
• sets the pme_status bit in the pmcsr and issues a pm_pme message (in some cases, this might require asserting the wake# signal first to resume power and clock to the pcie interface).
• stores the first 128 bytes of the packet in the wake up packet memory (wupm) register.
• sets the magic packet received bit in the wake up status (wus) register.
• sets the packet length in the wake up packet length (wupl) register.

the 82576 maintains the first magic packet received in the wake up packet memory (wupm) register until the software device driver writes a 1b to the magic packet received (mag) bit in the wake up status (wus) register.
apm wake up is supported in all power states and only disabled if a subsequent eeprom read results in the apm wake up bit being cleared or software explicitly writes a 0b to the apm wake up (apm) bit of the wuc register.

5.6.2 pcie power management wake up

the 82576 supports pcie power management based wake ups. it can generate system wake-up events from three sources:
- reception of a magic packet.
- reception of a network wakeup packet.
- detection of a link change of state.

activating pcie power management wake up requires the following:
- the software device driver programs the wake up filter control (wufc) register to indicate the packets it needs to wake up and supplies the necessary data to the ipv4/v6 address table (ip4at, ip6at) and the flexible host filter table (fhft). it can also set the link status change wake up enable (lnkc) bit in the wake up filter control (wufc) register to cause wake up when the link changes state.
- the operating system (at configuration time) writes a 1b to the pme_en bit of the power management control/status (pmcsr.8) register.

normally, after enabling wake up, the operating system writes 11b to the lower two bits of the pmcsr to put the 82576 into low-power mode.

once wake up is enabled, the 82576 monitors incoming packets, first filtering them according to its standard address filtering method, then filtering them with all of the enabled wakeup filters. if a packet passes both the standard address filtering and at least one of the enabled wakeup filters, the 82576:
- sets the pme_status bit in the pmcsr.
- asserts pe_wake_n (if the pme_en bit in the pmcsr is set).
- stores the first 128 bytes of the packet in the wake up packet memory (wupm) register.
- sets one or more of the received bits in the wake up status (wus) register. note that the 82576 sets more than one bit if a packet matches more than one filter.
- sets the packet length in the wake up packet length (wupl) register.

if enabled, a link state change wake up causes similar results, setting pme_status, asserting pe_wake_n and setting the link status changed (lnkc) bit in the wake up status (wus) register when the link goes up or down.

the 82576 supports the following change described in the pcie base specification, rev. 1.1rd (section 5.3.3.4): on receiving a pme_turn_off message, the 82576 must block the transmission of pm_pme messages and transmit a pme_to_ack message upstream. the 82576 is permitted to send a pm_pme message after the link is returned to an l0 state through ldn.

pe_wake_n remains asserted until the operating system either writes a 1b to the pme_status bit of the pmcsr register or writes a 0b to the pme_en bit.

after receiving a wake-up packet, the 82576 ignores any subsequent wake-up packets until the software device driver clears all of the received bits in the wake up status (wus) register. it also ignores link change events until the software device driver clears the link status changed (lnkc) bit in the wake up status (wus) register.
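The pmcsr handshake described in section 5.6.2 can be sketched as a small Python model. This is an illustration only, not driver code. The bit positions (power state in bits 1:0, pme_en in bit 8, pme_status in bit 15 and cleared by writing 1b) follow the standard PCI power management register layout and agree with the pmcsr.8 reference above; the function and constant names are hypothetical.

```python
# Hypothetical model of the pmcsr bits involved in pcie wake up.
PMCSR_POWER_STATE_D3 = 0x0003   # bits 1:0 = 11b (low-power state)
PMCSR_PME_EN = 1 << 8           # pmcsr.8
PMCSR_PME_STATUS = 1 << 15      # set by the device, cleared by writing 1b


def arm_wake(pmcsr: int) -> int:
    """Register state after the OS enables pme and selects the low-power state."""
    return (pmcsr | PMCSR_PME_EN | PMCSR_POWER_STATE_D3) & 0xFFFF


def wake_asserted(pmcsr: int) -> bool:
    """pe_wake_n is asserted only while pme_status and pme_en are both set."""
    return bool(pmcsr & PMCSR_PME_STATUS) and bool(pmcsr & PMCSR_PME_EN)


def ack_wake(pmcsr: int) -> int:
    """Register state after the OS writes 1b to pme_status (which clears it)."""
    return pmcsr & ~PMCSR_PME_STATUS & 0xFFFF


state = arm_wake(0x0000)
state_after_packet = state | PMCSR_PME_STATUS  # device saw a wake-up packet
```

In this model, clearing either pme_status (via ack_wake) or pme_en releases pe_wake_n, which mirrors the two exit conditions listed above.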
note: a wake on link change is not supported when configured to serdes mode.

5.6.3 wake-up packets

the 82576 supports various wake-up packets using two types of filters:
- pre-defined filters
- flexible filters

each of these filters is enabled if the corresponding bit in the wake up filter control (wufc) register is set to 1b.

5.6.3.1 pre-defined filters

the following packets are supported by the 82576's pre-defined filters:
- directed packet (including exact, multicast indexed, and broadcast)
- magic packet
- arp/ipv4 request packet
- directed ipv4 packet
- directed ipv6 packet

each of these filters is enabled if the corresponding bit in the wake up filter control (wufc) register is set to 1b.

the explanation of each filter includes a table showing which bytes at which offsets are compared to determine if the packet passes the filter.

note: both vlan frames and llc/snap can increase the given offsets if they are present.

5.6.3.1.1 directed exact packet

the 82576 generates a wake-up event after receiving any packet whose destination address matches one of the 24 valid programmed receive addresses, if the directed exact wake up enable bit is set in the wake up filter control (wufc.ex) register.

5.6.3.1.2 directed multicast packet

for multicast packets, the upper bits of the incoming packet's destination address index a bit vector, the multicast table array (mta), that indicates whether to accept the packet. if the directed multicast wake up enable bit is set in the wake up filter control (wufc.mc) register and the indexed bit in the vector is one, then the 82576 generates a wake-up event. the exact bits used in the comparison are programmed by software in the multicast offset field of the receive control (rctl.mo) register.

offset | # of bytes | field | value | action | comment
0 | 6 | destination address | | compare | see section 5.6.3.1.2.

5.6.3.1.3 broadcast
if the broadcast wake up enable bit in the wake up filter control (wufc.bc) register is set, the 82576 generates a wake-up event when it receives a broadcast packet.

offset | # of bytes | field | value | action | comment
0 | 6 | destination address | ff*6 | compare |

5.6.3.1.4 magic packet

magic packets are defined in:
http://www.amd.com/us-en/assets/content_type/white_papers_and_tech_docs/20213.pdf as:

"once the lan controller has been put into the magic packet mode, it scans all incoming frames addressed to the node for a specific data sequence. this sequence indicates to the controller that this is a magic packet frame. a magic packet frame must also meet the basic requirements for the lan technology chosen, such as source address, destination address (which may be the receiving station's ieee address or a multicast address which includes the broadcast address), and crc. the specific data sequence consists of 16 repetitions of the ieee address of this node, with no breaks or interruptions. this sequence can be located anywhere within the packet, but must be preceded by a synchronization stream. the synchronization stream allows the scanning state machine to be much simpler. the synchronization stream is defined as 6 bytes of 0xff. the device will also accept a broadcast frame, as long as the 16 repetitions of the ieee address match the address of the machine to be awakened."

the 82576 expects the destination address to either:
- be the broadcast address (ff.ff.ff.ff.ff.ff)
- match the value in receive address 0 (rah0, ral0) register. this is initially loaded from the eeprom but can be changed by the software device driver.
- match any other address filtering enabled by the software device driver.

the 82576 searches for the contents of receive address 0 (rah0, ral0) register as the embedded ieee address. it considers any non-0xff byte after a series of at least 6 0xffs to be the start of the ieee address for comparison purposes (for example, it catches the case of 7 0xffs followed by the ieee address). as soon as one of the first 96 bytes after a string of 0xffs doesn't match, it continues to search for another set of at least 6 0xffs followed by the 16 copies of the ieee address later in the packet. note that this definition precludes the first byte of the destination address from being ff.

a magic packet's destination address must match the address filtering enabled in the configuration registers with the exception that broadcast packets are considered to match even if the broadcast accept bit of the receive control (rctl.bam) register is 0b. if apm wake up (wake up by a magic packet) is enabled in the eeprom, the 82576 starts up with the receive address 0 (rah0, ral0) register loaded from the eeprom. this enables the 82576 to accept packets with the matching ieee address before the software device driver loads.
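The magic packet definition above can be illustrated with a short Python check. This is a simplified sketch under stated assumptions: it uses a plain substring search for 6 bytes of 0xff followed by 16 copies of the node address rather than modelling the hardware scanning state machine, it only accepts a broadcast destination or a destination equal to the node address (other enabled address filters are not modelled), and it starts searching after the two mac addresses. The function name is hypothetical.

```python
def is_magic_packet(frame: bytes, node_addr: bytes) -> bool:
    """Return True if frame carries a magic packet sequence for node_addr."""
    if len(node_addr) != 6:
        raise ValueError("node_addr must be 6 bytes")

    # Destination must be broadcast or the node address (simplified filter).
    dest = frame[:6]
    if dest != b"\xff" * 6 and dest != node_addr:
        return False

    # Synchronization stream (6 x 0xff) followed by 16 copies of the address.
    # Extra leading 0xff bytes are tolerated because the last six of them
    # still form a valid synchronization stream.
    pattern = b"\xff" * 6 + node_addr * 16
    return pattern in frame[12:]


node = bytes.fromhex("00a0c9123456")
frame = (b"\xff" * 6            # broadcast destination
         + b"\x00" * 6          # source address
         + b"\x08\x42"          # type (skipped by the filter)
         + b"\x00" * 10         # arbitrary payload
         + b"\xff" * 7          # 7 x 0xff, as in the example above
         + node * 16)
```

Here is_magic_packet(frame, node) is true, while corrupting any byte of the 16 repetitions makes the check fail.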
note: accepting broadcast magic packets for wake up purposes when the broadcast accept bit of the receive control (rctl.bam) register is 0b is a change from a previous device, which initialized rctl.bam to 1b if apm was enabled in the eeprom, but then required that bit to be 1b to accept broadcast magic packets, unless broadcast packets passed another perfect or multicast filter.

table 5-5. magic packet structure
offset | # of bytes | field | value | action | comment
0 | 6 | destination address | | compare | mac header - processed by main address filter.
6 | 6 | source address | | skip |
12 | s=(0/4) | possible vlan tag | | skip |
12 + s | d=(0/8) | possible length + llc/snap header | | skip |
12 + s + d | 2 | type | | skip |
any | 6 | synchronizing stream | ff*6+ | compare |
any+6 | 96 | 16 copies of node address | a*16 | compare | compared to receive address 0 (rah0, ral0) register.

5.6.3.1.5 arp/ipv4 request packet

the 82576 supports receiving arp request packets for wake up if the arp bit is set in the wake up filter control (wufc) register. four ipv4 addresses are supported, which are programmed in the ipv4 address table (ip4at). a successfully matched packet must contain a broadcast mac address, a protocol type of 0x0806, an arp op-code of 0x01, and one of the four programmed ipv4 addresses. the 82576 also handles arp request packets that have vlan tagging on both ethernet ii and ethernet snap types.

table 5-6. arp packet structure and processing
offset | # of bytes | field | value | action | comment
0 | 6 | destination address | | compare | mac header - processed by main address filter.
6 | 6 | source address | | skip |
12 | s=(0/4) | possible vlan tag | | compare | processed by main address filter.
12 + s | d=(0/8) | possible length + llc/snap header | | skip |
12 + s + d | 2 | ethernet type | 0x0806 | compare | arp
14 + s + d | 2 | hw type | 0x0001 | compare |
16 + s + d | 2 | protocol type | 0x0800 | compare |
18 + s + d | 1 | hardware size | 0x06 | compare |
19 + s + d | 1 | protocol address length | 0x04 | compare |
20 + s + d | 2 | operation | 0x0001 | compare |
22 + s + d | 6 | sender hw address | - | ignore |
28 + s + d | 4 | sender ip address | - | ignore |
32 + s + d | 6 | target hw address | - | ignore |
38 + s + d | 4 | target ip address | ip4at | compare | compare if the directed arp bit is set to 1b. may match any of four values in ip4at.

5.6.3.1.6 directed ipv4 packet

the 82576 supports receiving directed ipv4 packets for wake up if the ipv4 bit is set in the wake up filter control (wufc) register. four ipv4 addresses are supported, which are programmed in the ipv4 address table (ip4at). a successfully matched packet must contain the station's mac address, a protocol type of 0x0800, and one of the four programmed ipv4 addresses. the 82576 also handles directed ipv4 packets that have vlan tagging on both ethernet ii and ethernet snap types.

table 5-7. ipv4 packet structure and processing
offset | # of bytes | field | value | action | comment
0 | 6 | destination address | | compare | mac header - processed by main address filter.
6 | 6 | source address | | skip |
12 | s=(0/4) | possible vlan tag | | compare | processed by main address filter.
12 + s | d=(0/8) | possible length + llc/snap header | | skip |
12 + s + d | 2 | ethernet type | 0x0800 | compare | ipv4
14 + s + d | 1 | version/hdr length | 0x4x | compare | check ipv4
15 + s + d | 1 | type of service | - | ignore |
16 + s + d | 2 | packet length | - | ignore |
18 + s + d | 2 | identification | - | ignore |
20 + s + d | 2 | fragment info | - | ignore |
22 + s + d | 1 | time to live | - | ignore |
23 + s + d | 1 | protocol | - | ignore |
24 + s + d | 2 | header checksum | - | ignore |
26 + s + d | 4 | source ip address | - | ignore |
30 + s + d | 4 | destination ip address | ip4at | compare | may match any of four values in ip4at.

5.6.3.1.7 directed ipv6 packet

the 82576 supports receiving directed ipv6 packets for wake up if the ipv6 bit is set in the wake up filter control (wufc) register. one ipv6 address is supported and is programmed in the ipv6 address table (ip6at). a successfully matched packet must contain the station's mac address, a protocol type of 0x86dd, and the programmed ipv6 address. in addition, the ipav.v60 bit should be set. the 82576 also handles directed ipv6 packets that have vlan tagging on both ethernet ii and ethernet snap types.

table 5-8. ipv6 packet structure and processing
offset | # of bytes | field | value | action | comment
0 | 6 | destination address | | compare | mac header - processed by main address filter.
6 | 6 | source address | | skip |
12 | s=(0/4) | possible vlan tag | | compare | processed by main address filter.
12 + s | d=(0/8) | possible length + llc/snap header | | skip |
12 + s + d | 2 | ethernet type | 0x86dd | compare | ipv6
14 + s + d | 1 | version/priority | 0x6x | compare | check ipv6
15 + s + d | 3 | flow label | - | ignore |
18 + s + d | 2 | payload length | - | ignore |
20 + s + d | 1 | next header | - | ignore |
21 + s + d | 1 | hop limit | - | ignore |
22 + s + d | 16 | source ip address | - | ignore |
38 + s + d | 16 | destination ip address | ip6at | compare | match value in ip6at.

5.6.3.2 flexible filters

the 82576 supports a total of six flexible filters. each filter can be configured to recognize an arbitrary pattern within the first 128 bytes of the packet. to configure the flexible filters, software programs the mask values, the required values, and the minimum packet length into the flexible host filter table (fhft and fhft_ext). each of the six flexible filters has its own set of values. software must also enable the filters in the wake up filter control (wufc) register, and enable the overall wake up functionality. the overall wake up functionality must be enabled by setting pme_en in the pmcsr or the wake up control (wuc) register.

once enabled, the flexible filters scan incoming packets for a match. if the filter encounters any byte in the packet where the mask bit is one and the byte doesn't match the value programmed in the flexible host filter table (fhft), then the filter fails that packet. if the filter reaches the required length without failing the packet, it passes the packet and generates a wake-up event. it ignores any mask bits set to one beyond the required length.

note: the flex filters are temporarily disabled when read from or written to by the host. any packet received during a read or write operation is dropped. filter operation resumes once the read or write access completes.

the following packets are listed for reference purposes only. the flexible filter could be used to filter these packets.

5.6.3.2.1 ipx diagnostic responder request packet

an ipx diagnostic responder request packet must contain a valid mac address, a protocol type of 0x8137, and an ipx diagnostic socket of 0x0456. it might include llc/snap headers and vlan tags. since filtering this packet relies on the flexible filters, which use offsets specified by the operating system directly, the operating system must account for the extra offset introduced by llc/snap headers and vlan tags.

table 5-9. ipx diagnostic responder request packet structure and processing
offset | # of bytes | field | value | action | comment
0 | 6 | destination address | | compare |
6 | 6 | source address | | skip |
12 | s=(0/4) | possible vlan tag | | skip |
12 + s | d=(0/8) | possible length + llc/snap header | | skip |
12 + s + d | 2 | ethernet type | 0x8137 | compare | ipx
14 + s + d | 16 | some ipx stuff | - | ignore |
30 + s + d | 2 | ipx diagnostic socket | 0x0456 | compare |

5.6.3.2.2 directed ipx packet

a valid directed ipx packet contains the station's mac address, a protocol type of 0x8137, and an ipx node address that is equal to the station's mac address. it might include llc/snap headers and vlan tags. since filtering this packet relies on the flexible filters, which use offsets specified by the operating system directly, the operating system must account for the extra offset introduced by llc/snap headers and vlan tags.

table 5-10. ipx packet structure and processing
offset | # of bytes | field | value | action | comment
0 | 6 | destination address | | compare | mac header - processed by main address filter.
6 | 6 | source address | | skip |
12 | s=(0/4) | possible vlan tag | | skip |
12 + s | d=(0/8) | possible length + llc/snap header | | skip |
12 + s + d | 2 | ethernet type | 0x8137 | compare | ipx
14 + s + d | 10 | some ipx info | - | ignore |
24 + s + d | 6 | ipx node address | receive address 0 | compare | must match receive address 0.

5.6.3.2.3 ipv6 neighbor discovery filter

in ipv6, a neighbor discovery packet is used for address resolution. a flexible filter can be used to check for a neighbor discovery packet.

5.6.3.3 wake up packet storage

the 82576 saves the first 128 bytes of the wake-up packet in its internal buffer, which can be read through the wake up packet memory (wupm) register after the system wakes up.
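The flexible filter matching rule of section 5.6.3.2, together with the offset adjustment for vlan tags (s) and llc/snap headers (d), can be sketched in Python. This is a software model for illustration, not the register layout of the fhft: the mask is held as one flag per byte offset for readability, the class and function names are hypothetical, and the destination address comparison from table 5-9 is left unmasked to keep the example short.

```python
from dataclasses import dataclass


@dataclass
class FlexFilter:
    value: bytes   # required byte at each offset
    mask: bytes    # one flag per offset; non-zero means "compare this byte"
    length: int    # required (minimum) packet length, at most 128


def flex_match(f: FlexFilter, packet: bytes) -> bool:
    """Pass if every masked byte within the required length matches."""
    if len(packet) < f.length:
        return False
    for i in range(f.length):          # mask bits beyond length are ignored
        if f.mask[i] and packet[i] != f.value[i]:
            return False
    return True


def ipx_diag_filter(vlan: bool, llc_snap: bool) -> FlexFilter:
    """Build a filter for the ipx diagnostic responder request packet."""
    s = 4 if vlan else 0       # possible vlan tag
    d = 8 if llc_snap else 0   # possible length + llc/snap header
    length = 32 + s + d        # up to the end of the diagnostic socket field
    value = bytearray(length)
    mask = bytearray(length)
    for offset, data in ((12 + s + d, b"\x81\x37"),   # ethernet type: ipx
                         (30 + s + d, b"\x04\x56")):  # ipx diagnostic socket
        value[offset:offset + 2] = data
        mask[offset:offset + 2] = b"\x01\x01"
    return FlexFilter(bytes(value), bytes(mask), length)


packet = bytearray(64)
packet[12:14] = b"\x81\x37"
packet[30:32] = b"\x04\x56"
untagged = ipx_diag_filter(vlan=False, llc_snap=False)
tagged = ipx_diag_filter(vlan=True, llc_snap=False)
```

The untagged filter matches the sample packet, while the filter built for vlan-tagged frames does not, because its compare offsets are shifted by s = 4. This is the adjustment the operating system has to make when it programs the offsets itself.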
6.0 non-volatile memory map - eeprom

6.1 eeprom general map

table 6-1 lists the 82576 eeprom map.

table 6-1. eeprom map
word | used by | contents (high byte / low byte) | lan
00 | hw | section 6.2.1, ethernet address (words 0x00:02) | both
01 | hw | |
02 | hw | |
03 | sw | section 6.10.1, compatibility (word 0x03) | both
04 | sw | section 6.10.2, oem specific (word 0x04) | both
05 | sw | section 6.10.4, eeprom image revision (word 0x05) | both
06:07 | sw | section 6.10.3, oem specific (word 0x06, 0x07) | both
08:09 | sw | section 6.10.5, pba number module (word 0x08, 0x09) |
0a | hw | section 6.2.2, initialization control word 1 (word 0x0a) | both
0b | hw | section 6.2.3, subsystem id (word 0x0b) | both
0c | hw | section 6.2.4, subsystem vendor id (word 0x0c) | both
0d | hw | section 6.2.5, device id (word 0x0d, 0x11) | lan0
0e | hw | reserved |
0f | hw | section 6.2.7, initialization control word 2 lan1 (word 0x0f) | both
10 | hw | section 6.2.8, software defined pins control lan1 (word 0x10) | lan1
11 | hw | section 6.2.5, device id (word 0x0d, 0x11) | lan1
12 | hw | section 6.2.10, eeprom sizing and protected fields (word 0x12) | both
13 | hw | reserved |
14 | hw | section 6.2.12, initialization control 3 (word 0x14, 0x24) | lan1
15 | hw | section 6.2.13, pcie completion timeout configuration (word 0x15) | both
16 | hw | section 6.2.14, msi-x configuration (word 0x16) | both
17 | fw | section 6.3.1, analog configuration pointers start address (offset 0x17) | both
18 | hw | section 6.2.15, pcie init configuration 1 word (word 0x18) | both
19 | hw | section 6.2.16, pcie init configuration 2 word (word 0x19) | both
1a | hw | section 6.2.17, pcie init configuration 3 word (word 0x1a) | both
1b | hw | section 6.2.18, pcie control (word 0x1b) | both
1c | hw | section 6.2.19, led 1,3 configuration defaults (word 0x1c, 0x2a) | lan0
1d | hw | section 6.2.6, dummy device id (word 0x1d) | both
1e | hw | section 6.2.20, device rev id (word 0x1e) | both
1f | fw | section 6.2.21, led 0,2 configuration defaults (word 0x1f, 0x2b) | lan0
20 | hw | section 6.2.9, software defined pins control lan0 (word 0x20) | lan0
21 | hw | section 6.2.22, functions control (word 0x21) | both
22 | hw | section 6.2.23, lan power consumption (word 0x22) | both
23 | hw | section 6.5.9, management hw config control (word 0x23) | both
24 | hw | section 6.2.12, initialization control 3 (word 0x14, 0x24) | lan0
25 | hw | section 6.2.24, i/o virtualization (iov) control (word 0x25) | both
26 | hw | section 6.2.25, iov device id (word 0x26) | both
27 | hw | reserved |
28 | hw | reserved |
29 | hw | reserved |
2a | hw | section 6.2.19, led 1,3 configuration defaults (word 0x1c, 0x2a) | lan1
2b | hw | section 6.2.21, led 0,2 configuration defaults (word 0x1f, 0x2b) | lan1
2c | hw | section 6.2.26, end of read-only (ro) area (word 0x2c) | both
2d | hw | section 6.2.27, start of ro area (word 0x2d) | both
2e | hw | section 6.2.28, watchdog configuration (word 0x2e) | both
2f | oem | section 6.2.29, vpd pointer (word 0x2f) |
30 | pxe | section 6.10.6.1, main setup options pci function 0 (word 0x30) |
31 | pxe | section 6.10.6.2, configuration customization options pci function 0 (word 0x31) |
32 | pxe | section 6.10.6.3, pxe version (word 0x32) |
33 | pxe | section 6.10.6.4, iba capabilities (word 0x33) |
34 | pxe | section 6.10.6.5, setup options pci function 1 (word 0x34) |
35 | pxe | section 6.10.6.6, configuration customization options pci function 1 (word 0x35) |
36 | hw | section 6.10.6.7, iscsi option rom version (word 0x36) |
38 | pxe | section 6.10.6.8, setup options pci function 2 (word 0x38) |
39 | pxe | section 6.10.6.9, configuration customization options pci function 2 (word 0x39) |
3a | pxe | section 6.10.6.10, setup options pci function 3 (word 0x3a) |
3b | pxe | section 6.10.6.11, configuration customization options pci function 3 (word 0x3b) |
3d | iscsi | section 6.10.8, alternate mac address pointer (word 0x37) |
3f | pxe | section 6.10.9, checksum word (word 0x3f) |
40 | hw | section 6.2.30, nc-si arbitration enable (word 0x40) |
41 | hw | reserved |
42 | sw | section 6.10.10, image unique id (word 0x42, 0x43) |
43 | sw | section 6.10.10, image unique id (word 0x42, 0x43) |
44:4f | hw | reserved |
50:5xx | fw | section 6.5, firmware pointers & control words | mng

6.2 hardware accessed words

this section describes the eeprom words that are loaded by 82576 hardware. most of these bits are located in configuration registers. the words are only read and used if the signature field in the eeprom sizing & protected fields (word 0x12) is valid.

note: there are two values given for many locations. one value is the hardware default (no eeprom present). the other is an example of a value loaded from eeprom (the values used are from the 82576_dev_start_no_mgmt_copper_a1 image). depending on the image loaded, the value you see may be different. the eeprom values provided are illustrations. pointers and inactive areas have not been transcribed.

6.2.1 ethernet address (words 0x00:02)

the ethernet individual address (ia) is a 6-byte field that must be unique for each nic, and thus unique for each copy of the eeprom image. the first three bytes are vendor specific. for example, the ia is equal to [00 aa 00] or [00 a0 c9] for intel products. the value from this field is loaded into the receive address register 0 (ral0/rah0).

for the purpose of this specification, the ia byte numbering convention is indicated as follows:

vendor | ia byte 1 | 2 | 3 | 4 | 5 | 6
intel original | 00 | aa | 00 | variable | variable | variable
intel new | 00 | a0 | c9 | variable | variable | variable

the ethernet address is loaded for lan0 and bit 41 (8th msb) is inverted for lan1 (bit 0 byte 6 in the eeprom = bit 8 in eeprom word 0x2).
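The relationship between the lan0 and lan1 addresses in section 6.2.1 can be illustrated with a few lines of Python. This sketch rests on one assumption that the text only states for word 0x02: each eeprom word is taken to hold the earlier ia byte in its low byte and the later ia byte in its high byte (so bit 0 of ia byte 6 is bit 8 of word 0x02). The function names and the sample word values are hypothetical.

```python
def ia_from_eeprom_words(words):
    """Assemble the 6-byte ia from eeprom words 0x00..0x02 (assumed layout)."""
    ia = bytearray()
    for word in words:
        ia.append(word & 0xFF)          # low byte: earlier ia byte
        ia.append((word >> 8) & 0xFF)   # high byte: later ia byte
    return bytes(ia)


def lan1_address(lan0: bytes) -> bytes:
    """lan1 uses the lan0 address with bit 0 of ia byte 6 inverted."""
    return lan0[:5] + bytes([lan0[5] ^ 0x01])


# Hypothetical eeprom contents for an address starting with 00 a0 c9.
words = [0xA000, 0x12C9, 0x5634]
lan0 = ia_from_eeprom_words(words)
lan1 = lan1_address(lan0)
```

With these sample words, lan0 is 00 a0 c9 12 34 56 and lan1 is 00 a0 c9 12 34 57. Applying lan1_address twice returns the original address, since the operation is a single bit inversion.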
6.2.2 initialization control word 1 (word 0x0a)

the first word read by the 82576 contains initialization values that:
- set defaults for some internal registers
- enable/disable specific features
- determine which pci configuration space values are loaded from the eeprom

bit | name | hardware default | loaded from eeprom (note 1): 0x046b | description
15:13 | reserved | 0b | 000b | reserved - must be zero.
12 | reserved | 0b | 0b | reserved - must be zero.
11 | frcspd | 0b | 0b | default setting for the force speed bit in the device control register (ctrl[11]). see section 8.2.1, device control register - ctrl (0x00000; r/w).
10 | fd | 0b | 1b | default setting for duplex setting. mapped to ctrl[0]. see section 8.2.1, device control register - ctrl (0x00000; r/w).
9 | reserved | 1b | 0b | reserved - should be set to zero.
8:7 | reserved | 0b | 0b | reserved - must be zero.
6 | sdp_iddq_en | 0b | 1b | when set, the sdps keep their value and direction when the 82576 enters dynamic iddq mode. otherwise, the sdps move to highz + pull-up mode in dynamic iddq mode. reflected in eediag (see section 8.4.5, eeprom diagnostic - eediag (0x01038; ro)).
5 | deadlock timeout enable | 1b | 1b | if set, a device granted access to the eeprom or flash that does not toggle the interface for more than 1 second might have the grant revoked. see section 8.2.1, device control register - ctrl (0x00000; r/w).
4 | ilos | 0b | 0b | default setting for the loss-of-signal polarity setting for ctrl[7]. see section 8.2.1, device control register - ctrl (0x00000; r/w).
3 | power management | 1b | 1b | reserved - must be one. 1b = full support for power management (for normal operation, this bit must be set to 1b). see section 9.5.1, pci power management registers.
2 | reserved | 0b | 0b | reserved - must be zero.
1 | load subsystem ids | 1b | 1b | this bit, when set to 1b, indicates that the 82576 is to load its pcie subsystem id and subsystem vendor id from the eeprom (words 0x0b, 0x0c).
0 | load vendor/device ids | 1b | 1b | this bit, when set to 1b, indicates that the 82576 is to load its pcie device ids from the eeprom (words 0x0d, 0x11, 0x1d, 0x26).

note 1: example eeprom values are from the 82576_dev_start_no_mgmt_copper_a1 image. as there are numerous images, your values may differ.
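As a worked check of the table for word 0x0a, the sample value 0x046b can be split into its named fields with a short Python helper. The bit positions come from the table above; the dictionary keys and the function name are hypothetical, and reserved bits are simply not reported.

```python
# (bit position, width) for each named field of initialization control word 1.
ICW1_FIELDS = {
    "frcspd": (11, 1),
    "fd": (10, 1),
    "sdp_iddq_en": (6, 1),
    "deadlock_timeout_enable": (5, 1),
    "ilos": (4, 1),
    "power_management": (3, 1),
    "load_subsystem_ids": (1, 1),
    "load_vendor_device_ids": (0, 1),
}


def decode_icw1(word: int) -> dict:
    """Extract the named fields of word 0x0a from a 16-bit eeprom value."""
    return {
        name: (word >> shift) & ((1 << width) - 1)
        for name, (shift, width) in ICW1_FIELDS.items()
    }


sample = decode_icw1(0x046B)
```

For 0x046b this yields fd, sdp_iddq_en, deadlock_timeout_enable, power_management and both load bits set, with frcspd and ilos clear, which agrees with the "loaded from eeprom" column.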
non-volatile memory map - eeprom ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 223 6.2.3 subsystem id (word 0x0b) if the load subsystem ids in word 0x0a is set, this word is read in to initialize the subsystem id. see see section 9.4.14 . ? hardware default: 0x0000; loaded from sample eeprom: 0x0000. 6.2.4 subsystem vendor id (word 0x0c) if the load subsystem ids in word 0x0a is set, this word is read in to initialize the subsystem vendor id. the default value is 0x8086. see section 9.4.13, subsystem vendor id register (0x2c; ro) . ? hardware default 0x8086; loaded from sample eeprom 0x0000. 6.2.5 device id (word 0x0d, 0x11) if the load subsystem ids in word 0x0a is set, this word is read in to initialize the device id of lan0, and lan1 functions, respectively. the default value is 10c9. see section 9.4.3, command register (0x4; r/w) . ? hardware default: 0x10c9; loaded from sample eeprom: 0x10c9. 6.2.6 dummy device id (word 0x1d) if the load subsystem ids in word 0x0a is set, this word is read in to initialize the device id of dummy devices. the default value is 0x10a6. see section 9.4.1, vendor id register (0x0; ro) . ? hardware default: 0x10a6; loaded from sample eeprom:0x10a6. 6.2.7 initialization control word 2 lan1 (word 0x0f) this is the second word read by the 82576 and contains additional initialization values that: ? set defaults for some internal registers ? enable/disable specific features bit name hardware default loaded from eeprom: 0xf14b description 15 apm pme# enable 0b 1b initial value of the assert pme on apm wakeup bit in the wake up control (wuc.apmpme) register. see section 8.20.1, wakeup control register - wuc (0x05800; r/w) . 14 pcs parallel detect 1b 1b enables pcs parallel detect. mapped to pcs_lctl.an timeout en bit. see section 8.18.2, pcs link control - pcs_lctl (0x04208; rw) 13:12 pause capability 11b 11b desired pause capability for advertised configuration base page. 
mapped to pcs_anadv.asm. see section 8.18.4, an advertisement - pcs_anadv (0x04218; r/w) . 11 ane 0b 0b auto-negotiation enable. mapped to pcs_lctl.an_enable. see section 8.18.2, pcs link control - pcs_lctl (0x04208; rw) .
intel ? 82576eb gbe controller ? non-volatile memory map - eeprom intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 224 6.2.8 software defined pins control lan1 (word 0x10) this word is used to configure initial settings for the software definable pins (sdps) for lan1. 10:8 flash size indication 000b 001b indicates flash size according to the following equation: ? size = 64 kb * 2**( flash size indication field). from 64 kb up to 8 mb in powers of 2. the flash size impacts the requested memory space for the flash and expansion rom bars in pcie configuration space. 7 dma clock gating enabled 1b 0b enables automatic reduction of dma and mac frequency. mapped to status[31]. this bit is relevant only if the l1 indication enable bit is set. see section 8.2.2, device status register - status (0x00008; r) . 6 phy power down enable 1b 1b when set, enables the phy to enter a low-power state. this bit is mapped to ctrl_ext[20]. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/ w) . 5 reserved 0b 0b reserved - must be zero. 4 ccm pll shutdown enable 0b 0b when set, enables shutting down the ccm pll in low-power states when the phy is powered down (such as link disconnect). when cleared, the ccm pll is not shut down in a low-power state. reflected in eediag (see section 8.4.5, eeprom diagnostic - eediag (0x01038; ro)) . 3 l1 indication enable 0b 1b when set, enables idle indication to l1 mechanism. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w) 2 serdes low power enable 0b 0b when set, enables the serdes to enter a low power state when the function is in dr state. see chapter 5.0 and section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w) . 1 spd enable 1b 1b smart power down. when set, enables phy smart power down mode. see section 3.5.7.6.5, smart power-down (spd) . 
this bit is loaded to each of the phys, only when the lan1_oem_dis and lan0_oem_dis bits (word 0x23 bits 8:7) are cleared. 0 lplu 1b 1b low power link up. enables a decrease in link speed in non-d0a states when power policy and power management states dictate it. see section 3.5.7.6.4, low power link up - link speed control . this bit is loaded to each of the phys only when lan0/1 oem bits disable (word 0x23 bit 8:7) are cleared. bit name hardware default loaded from eeprom: 0xe30c description 15 sdpdir[3] 0b 1b sdp3 pin ? initial direction. this bit configures the initial hardware value of the sdp3_iodir bit in the extended device control (ctrl_ext) register following power up. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w) .
14 | sdpdir[2] | 0b | 1b | sdp2 pin - initial direction. this bit configures the initial hardware value of the sdp2_iodir bit in the extended device control (ctrl_ext) register following power up. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w).
13 | phy_in_lan_disable | 0b | 1b | determines the behavior of the mac and phy when a lan port is disabled through an external pin. 0b = mac and phy are kept functional in lan disable (to support manageability). 1b = mac and phy are powered down in lan disable (manageability cannot access the network through this port).
12 | reserved | 0b | 0b | reserved - must be zero.
11 | lan disable select | 0b | 0b | lan disable. when set to 1b, the appropriate lan is disabled.
10 | lan pci disable | 0b | 0b | lan pci disable. when set to 1b, the appropriate lan pci function is disabled. for example, the lan is functional for manageability operation but is not connected to the host through the pcie interface. reflected in eediag. see section 8.4.5, eeprom diagnostic - eediag (0x01038; ro).
9 | sdpdir[1] | 0b | 1b | sdp1 pin - initial direction. this bit configures the initial hardware value of the sdp1_iodir bit in the device control (ctrl) register following power up. see section 8.2.1, device control register - ctrl (0x00000; r/w).
8 | sdpdir[0] | 0b | 1b | sdp0 pin - initial direction. this bit configures the initial hardware value of the sdp0_iodir bit in the device control (ctrl) register following power up. see section 8.2.1, device control register - ctrl (0x00000; r/w).
7 | sdpval[3] | 0b | 0b | sdp3 pin - initial output value. this bit configures the initial power-on value output on sdp3 (when configured as an output) by configuring the initial hardware value of the sdp3_data bit in the extended device control (ctrl_ext) register after power up. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w).
6 | sdpval[2] | 0b | 0b | sdp2 pin - initial output value. this bit configures the initial power-on value output on sdp2 (when configured as an output) by configuring the initial hardware value of the sdp2_data bit in the extended device control (ctrl_ext) register after power up. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w).
5 | wd_sdp0 | 0b | 0b | when set, sdp[0] is used as a watchdog timeout indication. when cleared, it is used as an sdp (as defined in bits 8 and 0). see section 8.2.1, device control register - ctrl (0x00000; r/w).
4 | giga disable | 0b | 0b | when set, gbe operation is disabled. a usage example for this bit is to disable gbe operation if system power limits are exceeded. this bit is loaded to the phy only when lan1_oem_dis (word 0x23 bit 8) is cleared.
3 | disable 1000 in non-d0a | 0b | 1b | disables 1000 mb/s operation in non-d0a states. this bit is loaded to the phy only when lan1_oem_dis (word 0x23 bit 8) is cleared. see section 3.5.7.6.4, low power link up - link speed control.
2 | d3cold_wakeup_adven | 1b | 1b | configures the initial hardware default value of the advd3wuc bit in the device control (ctrl) register following power up. see section 8.2.1, device control register - ctrl (0x00000; r/w).
1 | sdpval[1] | 0b | 0b | sdp1 pin - initial output value. this bit configures the initial power-on value output on sdp1 (when configured as an output) by configuring the initial hardware value of the sdp1_data bit in the device control (ctrl) register after power up. see section 8.2.1, device control register - ctrl (0x00000; r/w).
0 | sdpval[0] | 0b | 0b | sdp0 pin - initial output value. this bit configures the initial power-on value output on sdp0 (when configured as an output) by configuring the initial hardware value of the sdp0_data bit in the device control (ctrl) register after power up. see section 8.2.1, device control register - ctrl (0x00000; r/w).

6.2.9 software defined pins control lan0 (word 0x20)
this word is used to configure initial settings for the software definable pins (sdps) for lan0.

bit | name | hardware default | loaded from eeprom: 0xe30c | description
15 | sdpdir[3] | 0b | 1b | sdp3 pin - initial direction. this bit configures the initial hardware value of the sdp3_iodir bit in the extended device control (ctrl_ext) register following power up. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w).
14 | sdpdir[2] | 0b | 1b | sdp2 pin - initial direction. this bit configures the initial hardware value of the sdp2_iodir bit in the extended device control (ctrl_ext) register following power up. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w).
13 | phy_in_lan_disable | 0b | 1b | determines the behavior of the mac and phy when a lan port is disabled through an external pin. 0b = mac and phy are kept functional in lan disable (to support manageability). 1b = mac and phy are powered down in lan disable (manageability cannot access the network through this port). reflected in eediag. see section 8.4.5, eeprom diagnostic - eediag (0x01038; ro).
12:10 | reserved | 0b | 0b | reserved - must be zero.
9 | sdpdir[1] | 0b | 0b | sdp1 pin - initial direction. this bit configures the initial hardware value of the sdp1_iodir bit in the device control (ctrl) register following power up. see section 8.2.1, device control register - ctrl (0x00000; r/w).
6.2.10 eeprom sizing and protected fields (word 0x12)
provides indication on eeprom size and protection.
note: if the enable protection bit in this word is set and the signature is valid, the software device driver has read but no write access to this word via the eec and eerd registers; in this case, write access is possible only via an authenticated firmware interface.

software defined pins control lan0 (word 0x20), continued:
8 | sdpdir[0] | 0b | 0b | sdp0 pin - initial direction. this bit configures the initial hardware value of the sdp0_iodir bit in the device control (ctrl) register following power up. see section 8.2.1, device control register - ctrl (0x00000; r/w).
7 | sdpval[3] | 0b | 1b | sdp3 pin - initial output value. this bit configures the initial power-on value output on sdp3 (when configured as an output) by configuring the initial hardware value of the sdp3_data bit in the extended device control (ctrl_ext) register after power up. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w).
6 | sdpval[2] | 0b | 1b | sdp2 pin - initial output value. this bit configures the initial power-on value output on sdp2 (when configured as an output) by configuring the initial hardware value of the sdp2_data bit in the extended device control (ctrl_ext) register after power up. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w).
5 | wd_sdp0 | 0b | 0b | when set, sdp[0] is used as a watchdog timeout indication. when cleared, it is used as an sdp (as defined in bits 8 and 0). see section 8.2.1, device control register - ctrl (0x00000; r/w).
4 | giga disable | 0b | 0b | when set, gbe operation is disabled. a usage example for this bit is to disable gbe operation if system power limits are exceeded. this bit is loaded to the phy only when lan0_oem_dis (word 0x23 bit 7) is cleared.
3 | disable 1000 in non-d0a | 0b | 0b | disables 1000 mb/s operation in non-d0a states. this bit is loaded to the phy only when lan0_oem_dis (word 0x23 bit 7) is cleared. see section 3.5.7.6.4, low power link up - link speed control.
2 | d3cold_wakeup_adven | 1b | 0b | configures the initial hardware default value of the advd3wuc bit in the device control (ctrl) register following power up. see section 8.2.1, device control register - ctrl (0x00000; r/w).
1 | sdpval[1] | 0b | 1b | sdp1 pin - initial output value. this bit configures the initial power-on value output on sdp1 (when configured as an output) by configuring the initial hardware value of the sdp1_data bit in the device control (ctrl) register after power up. see section 8.2.1, device control register - ctrl (0x00000; r/w).
0 | sdpval[0] | 0b | 1b | sdp0 pin - initial output value. this bit configures the initial power-on value output on sdp0 (when configured as an output) by configuring the initial hardware value of the sdp0_data bit in the device control (ctrl) register after power up. see section 8.2.1, device control register - ctrl (0x00000; r/w).
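The bit layout of the software defined pins control words can be unpacked with plain shifts and masks. The sketch below is illustrative only: the function name `decode_sdp_word` and the dictionary layout are assumptions, not part of the datasheet, and the sample value 0xE30C is the "loaded from eeprom" value quoted in the table header. The sketch reports raw bit values and does not interpret what a direction bit of 1 or 0 means.

```python
# Bit positions taken from the software defined pins control tables above.
SDPDIR_BITS = {0: 8, 1: 9, 2: 14, 3: 15}   # sdpdir[n] -> bit number
SDPVAL_BITS = {0: 0, 1: 1, 2: 6, 3: 7}     # sdpval[n] -> bit number


def decode_sdp_word(word):
    """Split a 16-bit SDP control word into per-pin bits and single-bit flags.

    Hypothetical helper for illustration; returns raw bit values only.
    """
    def bit(n):
        return (word >> n) & 1

    pins = {
        n: {"dir": bit(SDPDIR_BITS[n]), "val": bit(SDPVAL_BITS[n])}
        for n in range(4)
    }
    flags = {
        "phy_in_lan_disable": bit(13),
        "wd_sdp0": bit(5),
        "giga_disable": bit(4),
        "disable_1000_in_non_d0a": bit(3),
        "d3cold_wakeup_adven": bit(2),
    }
    return pins, flags


pins, flags = decode_sdp_word(0xE30C)
for n in range(4):
    print(f"sdp{n}: dir={pins[n]['dir']} val={pins[n]['val']}")
print(flags)
```

Keeping the bit positions in two small lookup tables makes it easy to see that the direction and value bits for the four pins are not contiguous in the word.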
eeprom sizing and protected fields (word 0x12), bit definitions:
bit | name | hardware default | loaded from eeprom: 0x5c00 | description
15:14 | signature | 01b | 01b | the signature field indicates to the 82576 that there is a valid eeprom present. if the signature field is 01b, eeprom read is performed; otherwise the other bits in this word are ignored, no further eeprom read is performed, and default values are used for the configuration space ids.
13:10 | eeprom size | 0111b | 0111b | these bits indicate the eeprom's actual size. mapped to eec[14:11]. 0000b = 128 bytes, 0001b = 256 bytes, 0010b = 512 bytes, 0011b = 1 kb, 0100b = 2 kb, 0101b = 4 kb, 0110b = 8 kb, 0111b = 16 kb, 1000b = 32 kb, 1001b = reserved, 1011b = reserved. see section 8.4.1, eeprom/flash control register - eec (0x00010; r/w).
9:5 | reserved | 00000b | 00000b | reserved - must be zero.
4 | enable eeprom protection | 0b | 0b | if set, all eeprom protection schemes are enabled.
3:0 | hepsize | 0000b | 0000b | hidden eeprom block size. this field defines the area at the end of the eeprom memory accessible only by manageability firmware. it can be used to store secured data and other manageability functions. the size in bytes of the secured area equals 0 bytes if hepsize equals zero, and 2^hepsize bytes otherwise (for example, 2 b, 4 b, ..., 32 kb).

6.2.11 reserved (word 0x13)
hardware default: 0x0; loaded from eeprom: 0x0.
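As a worked example of the eeprom sizing word (word 0x12), the sketch below decodes the documented value 0x5C00. The function name and the returned dictionary keys are illustrative assumptions. Size codes above 1000b are treated as undefined here because the table lists them as reserved or does not list them at all.

```python
def decode_eeprom_sizing(word):
    """Decode word 0x12 into its documented fields (hypothetical helper)."""
    signature = (word >> 14) & 0x3
    size_code = (word >> 10) & 0xF
    hepsize = word & 0xF
    return {
        # 01b marks a valid eeprom; any other value means the word is ignored.
        "signature_valid": signature == 0b01,
        # Codes 0000b..1000b double from 128 bytes up to 32 KB.
        "eeprom_bytes": (128 << size_code) if size_code <= 0b1000 else None,
        "protection_enabled": bool((word >> 4) & 1),
        # 0 bytes when hepsize is zero, otherwise 2^hepsize bytes.
        "hidden_block_bytes": 0 if hepsize == 0 else 2 ** hepsize,
    }


info = decode_eeprom_sizing(0x5C00)
print(info)
```

For 0x5C00 the signature is 01b and the size code is 0111b, which the table maps to 16 KB.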
6.2.12 initialization control 3 (word 0x14, 0x24)
this word controls general initialization values.

bit | name | hardware default | word 0x14 loaded from eeprom: 0x8c00 | word 0x24 loaded from eeprom: 0x8400 | description
15 | serdes energy source | 0b | 1b | 1b | serdes energy source detection. when set to 0b, internal serdes rx electrical idle indication. when set to 1b, external los signal. this bit also indicates the source of the signal detect while establishing a link in serdes mode. this bit sets the default value of the connsw.enrgsrc bit. see section 8.2.6, copper/fiber switch control - connsw (0x00034; r/w).
14 | 2 wires sfp enable | 0b | 0b | 0b | 2 wires interface sfp enable. 0b = disabled. when disabled, the 2 wires i/f pads are isolated. 1b = enabled. used to set the default value of ctrl_ext[25]. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w).
13 | lan flash disable | 1b | 0b | 0b | a value of 1b disables the flash logic. flash access bar in the pci configuration space is disabled.
12:11 | interrupt pin | 00b (lan 0), 01b (lan 1) | 01b | 00b | controls the value advertised in the interrupt pin field of the pci configuration header for this device/function. the encoding of this field is as follows: 00b = inta (interrupt pin field value 1); 01b = intb (interrupt pin field value 2); 10b = intc (interrupt pin field value 3); 11b = intd (interrupt pin field value 4). if only a single device/function of the 82576 component is enabled, this value is ignored and the interrupt pin field of the enabled 82576 reports inta# usage. see section 9.4.18, interrupt pin register (0x3d; ro).
10 | apm enable | 0b | 1b | 1b | initial value of advanced power management wake up enable bit in the wake up control (wuc.apme) register. mapped to ctrl[6] and to wuc[0]. see section 8.2.1, device control register - ctrl (0x00000; r/w) and section 8.20.1, wakeup control register - wuc (0x05800; r/w).
9:8 | link mode | 00b | 00b | 00b | initial value of link mode bits of the extended device control (ctrl_ext.link_mode) register, specifying which link interface and protocol is used by the mac. 00b = mac operates with internal copper phy (1000base-t). 01b = reserved. 10b = mac operates in sgmii mode. 11b = mac operates in serdes mode. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w).
7 | lan boot disable | 1b | 0b | 0b | a value of 1b disables the expansion rom bar in the pci configuration space.
6:2 | reserved | 0b | 00000b | 0b | reserved - must be zero.
1 | ext_vlan | 0b | 0b | 0b | sets the default for ctrl_ext[26] bit. indicates that additional vlan is expected in this system. see section 8.2.3, extended device control register - ctrl_ext (0x00018; r/w).
0 | keep_phy_link_up_en | 0b | 0b | 0b | enables no phy reset when the baseboard management controller (bmc) indicates that the phy should be kept on. when asserted, this bit prevents the phy reset signal and the power changes reflected to the phy according to the manc.keep_phy_link_up value. this bit should be set to the same value in both words (0x14 and 0x24) to reflect the same option to both lans.

the following table lists the different combinations of bits 13 and 7:
flash disable (bit 13) | boot disable (bit 7) | functionality (active windows)
0b | 0b | flash and expansion rom bars are active.
0b | 1b | flash bar is enabled and expansion rom bar is disabled.
1b | 0b | flash bar is disabled and expansion rom bar is enabled.
1b | 1b | flash and expansion rom bars are disabled.
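The initialization control 3 fields, including the bit 13 and bit 7 combinations in the table above, can be decoded as in this sketch. The function name and key names are illustrative assumptions; the two sample values are the "loaded from eeprom" values quoted for words 0x14 and 0x24.

```python
INTERRUPT_PIN = {0b00: "inta", 0b01: "intb", 0b10: "intc", 0b11: "intd"}
LINK_MODE = {
    0b00: "internal copper phy",
    0b01: "reserved",
    0b10: "sgmii",
    0b11: "serdes",
}


def decode_init_control_3(word):
    """Decode initialization control 3 (hypothetical helper)."""
    flash_disable = (word >> 13) & 1
    boot_disable = (word >> 7) & 1
    return {
        # A BAR is active when its disable bit is 0b.
        "flash_bar_active": flash_disable == 0,
        "expansion_rom_bar_active": boot_disable == 0,
        "interrupt_pin": INTERRUPT_PIN[(word >> 11) & 0x3],
        "apm_enable": bool((word >> 10) & 1),
        "link_mode": LINK_MODE[(word >> 8) & 0x3],
        "serdes_energy_source_external": bool((word >> 15) & 1),
    }


word_14 = decode_init_control_3(0x8C00)
word_24 = decode_init_control_3(0x8400)
print(word_14)
print(word_24)
```

The two sample words differ only in the interrupt pin field, which matches the 01b and 00b columns of the table.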
6.2.13 pcie completion timeout configuration (word 0x15)

bit | name | hardware default | loaded from eeprom: 0x0014 | description
15 | reserved | 0b | 0b | reserved - must be zero.
14:12 | reserved | 0x0 | 0x0 | reserved - must be zero.
11:8 | reserved | 0x0 | 0x0 | reserved - must be zero.
7 | completion timeout disable | 0b | 0b | disables the pcie completion timeout mechanism. 0b = completion timeout enabled. 1b = completion timeout disabled. see section 8.6.1, pcie control - gcr (0x05b00; rw). this bit is relevant only if the gio cap field in word 0x1a is set to 01b.
6:5 | completion timeout value | 0x0 | 0x0 | determines the range of the pcie completion timeout. 00b = 50 µs to 10 ms. 01b = 10 ms to 250 ms. 10b = 250 ms to 4 s. 11b = 4 s to 64 s. see section 9.5.5.12, device control 2 register (0xc8; rw). this field is relevant only if the gio cap field in word 0x1a is set to 01b.
4 | completion timeout resend | 1b | 1b | when set, a request is re-sent once the completion timeout has expired. 0b = do not re-send request on completion timeout. 1b = re-send request on completion timeout. see section 9.5.5.12, device control 2 register (0xc8; rw).
3:0 | reserved | 0100b | 0100b | reserved.

6.2.14 msi-x configuration (word 0x16)

bit | name | hardware default | loaded from eeprom: 0x4a40 | description
15:11 | msi_x0_n | 0x9 | 0x9 | this field specifies the number of entries in msi-x tables of lan 0. the range is 0-24. msi_x_n is equal to the number of entries minus one. see section 9.5.3.3, message control register (0x72; r/w).
10:6 | msi_x1_n | 0x9 | 0x9 | this field specifies the number of entries in msi-x tables of lan 1. the range is 0-24. msi_x_n is equal to the number of entries minus one. see section 9.5.3.3, message control register (0x72; r/w).
5:0 | reserved | 0x0 | 0 0000 | reserved - must be zero.

6.2.15 pcie init configuration 1 word (word 0x18)
this word is used to:
- set defaults for some internal registers.
- enable/disable specific features.

bit | name | hardware default | loaded from eeprom: 0x6cf6 | description
15 | reserved | 0b | 0b | reserved - must be zero.
14:12 | l1_act_ext_latency | 0x6 (32 ms to 64 ms) | 0x6 (32 ms to 64 ms) | l1 active exit latency for the configuration space. see section 9.5.5.7, link cap register (0xac; ro).
11:9 | l1_act_acc_latency | 0x6 (32 ms to 64 ms) | 0x6 (32 ms to 64 ms) | l1 active acceptable latency for the configuration space. see section 9.5.5.4, device capability register (0xa4; rw).
8:6 | l0s_acc_latency | 0x3 (512 ns) | 0x3 (512 ns) | l0s acceptable latency for the configuration space. see section 9.5.5.4, device capability register (0xa4; rw).
5:3 | l0s_se_ext_latency | 0x6 | 0x6 | l0s exit latency for active state power management (separated reference clock) - (latency between 64 ns - 128 ns). see section 9.5.5.7, link cap register (0xac; ro).
2:0 | l0s_co_ext_latency | 0x5b | 0x6 | l0s exit latency for active state power management (common reference clock) - (latency between 64 ns - 128 ns). see section 9.5.5.7, link cap register (0xac; ro).

6.2.16 pcie init configuration 2 word (word 0x19)
this word is used to set defaults for some internal pcie configuration registers.

bit | name | hardware default | loaded from eeprom: 0xd7b0 | description
15 | reserved | 1b | 1b | reserved - must be one.
14 | io_sup | 1b | 1b | i/o support (affects the i/o bar request). when set to 1b, i/o is supported.
13 | reserved | 0b | 0b | reserved - must be zero.
12 | serial number enable | 0b | 1b | when set, the pcie serial number capability is exposed in the configuration space. see section 9.6.2, serial number for details.
11:8 | reserved | 0x7 | 0x7 | reserved - must be 0111b.
7:0 | reserved | 0xb0 | 0xb0 | reserved - must be 0xb0.

6.2.17 pcie init configuration 3 word (word 0x1a)
this word is used to set defaults for some internal registers.
bit | name | hardware default | loaded from eeprom: 0x0abe | description
15:13 | reserved | 0b | 000b | reserved - must be zero.
12 | cache_lsize | 0b | 0b | cache line size. 0b = 64 bytes. 1b = 128 bytes.
11:10 | gio_cap | 10b | 10b | pcie capability version. the value of this field is reflected in the two lsbs of the capability version in the pcie cap register (config space - offset 0xa2). note: this is not the pcie version. it is the pcie capability version. this version is a field in the pcie capability structure and is not the same as the pcie version. it changes only when the content of the capability structure changes. for example, pcie 1.0, 1.0a, and 1.1 all have a capability version of one. pcie 2.0 has a version two because it added registers to the capabilities structures. see section 9.5.5.3, pcie cap register (0xa2; ro).
9:8 | max payload size | 10b | 10b | default packet size. 00b = 128 bytes. 01b = 256 bytes. 10b = 512 bytes. 11b = reserved. see section 9.5.5.4, device capability register (0xa4; rw).
7:6 | lane_width | 10b | 10b | max link width. 00b = 1 lane. 01b = 2 lanes. 10b = 4 lanes. 11b = reserved. see section 9.5.5.7, link cap register (0xac; ro).
5:4 | reserved | 11b | 11b | reserved.
3:2 | act_stat_pm_sup | 11b | 11b | determines support for active state link power management. loaded into the pcie active state link pm support register. see section 9.5.5.7, link cap register (0xac; ro).
1 | slot_clock_cfg | 1b | 1b | when set, the 82576 uses the pcie reference clock supplied on the connector (for add-in solutions).
0 | reserved | 0b | 0b | reserved - must be zero.

6.2.18 pcie control (word 0x1b)
used to configure initial settings for the pcie default functionality.
bit | name | hardware default | loaded from eeprom: 0x8403 | description
15 | enable wake# assertion | 0b | 1b | enable wake# assertion when pcie link up.
14 | dummy function enable | 0b | 0b | 0b = when function 0 is disabled, it is replaced by function 1. 1b = when function 0 is disabled, it is replaced with a dummy function.
13 | reserved | 0b | 0b | reserved - must be zero.
12 | lane reversal disable | 0b | 0b | when set, disables the ability to negotiate lane reversal.
11 | reserved | 0b | 0b | reserved.
10 | reserved | 1b | 1b | reserved.
9:2 | reserved | 0b | 0b | reserved - must be zero.
1:0 | latency_to_enter_l1 | 11b | 11b | period in l0s state before transition into an l1 state: 00b = 64 µsec. 01b = 256 µsec. 10b = 1 msec. 11b = 4 msec.

6.2.19 led 1,3 configuration defaults (word 0x1c, 0x2a)
these eeprom words specify the hardware defaults for the ledctl register fields controlling the led1 (activity indication) and led3 (link_1000 indication) output behaviors. word 0x1c controls lan0 leds behavior and word 0x2a controls lan1.

bit | name | hardware default | word 0x1c loaded from eeprom: 0x0783 | word 0x2a loaded from eeprom: 0x0783 | description
15 | led3 blink | 0b | 0b | 0b | initial value of led3_blink field. 0b = non-blinking. see section 8.2.1, device control register - ctrl (0x00000; r/w).
14 | led3 invert | 0b | 0b | 0b | initial value of led3_ivrt field. 0b = active-low output. see section 8.2.1, device control register - ctrl (0x00000; r/w).
13 | reserved | 0b | 0b | 0b | reserved - must be zero.
12 | reserved | 0b | 0b | 0b | reserved - must be zero.
11:8 | led3 mode | 0x7 | 0x7 | 0x7 | initial value of the led3_mode field specifying what event/state/pattern is displayed on led3 (link_1000) output. a value of 0111b (0x7) indicates 1000 mb/s operation. see section 8.2.1, device control register - ctrl (0x00000; r/w).
7 | led1 blink | 1b | 1b | 1b | initial value of led1_blink field. 0b = non-blinking. see section 8.2.1, device control register - ctrl (0x00000; r/w).
6 | led1 invert | 0b | 0b | 0b | initial value of led1_ivrt field. 0b = active-low output. see section 8.2.1, device control register - ctrl (0x00000; r/w).
5 | reserved | 0b | 0b | 0b | reserved - must be zero.
4 | reserved | 0b | 0b | 0b | reserved - must be zero.
3:0 | led1 mode | 0x3 | 0x3 | 0x3 | initial value of the led1_mode field specifying what event/state/pattern is displayed on led1 (activity) output. a value of 0011b (0x3) indicates the activity state. see section 8.2.1, device control register - ctrl (0x00000; r/w).

a value of 0x0703 is used to configure default hardware led behavior equivalent to previous copper adapters (led0=link_up, led1=blinking activity, led2=link_100, and led3=link_1000).
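Each led configuration word in section 6.2.19 packs two leds, one per byte, with the same blink/invert/mode layout in each byte. The sketch below decodes such a word; the function name and key names are illustrative assumptions, and the sample values 0x0783 and 0x0703 are the two values quoted in the text above.

```python
def decode_led_1_3_word(word):
    """Decode an led 1,3 configuration word (hypothetical helper).

    The low byte describes led1 and the high byte describes led3.
    """
    def led(field):
        return {
            "mode": field & 0xF,          # bits 3:0 of the byte
            "invert": (field >> 6) & 1,   # bit 6 of the byte
            "blink": (field >> 7) & 1,    # bit 7 of the byte
        }

    return {"led1": led(word & 0xFF), "led3": led((word >> 8) & 0xFF)}


loaded = decode_led_1_3_word(0x0783)
quoted = decode_led_1_3_word(0x0703)
print(loaded)
print(quoted)
```

Decoding both values shows that they differ only in the led1 blink bit (bit 7), with mode 0x3 for led1 and mode 0x7 for led3 in either case.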
6.2.20 device rev id (word 0x1e)

bit | name | hardware default | loaded from eeprom: 0x0001 | description
15 | power down enable | 0b | 0b | device off (dynamic iddq) enable/disable bit. see section 5.2.4.1, dr disable mode for details. reflected in eediag (section 8.4.5, eeprom diagnostic - eediag (0x01038; ro)).
14 | reserved | 0b | 0b | reserved - must be zero.
13 | reserved | 0b | 0b | reserved - must be zero.
12 | lan 1 iscsi enable | 0b | 0b | when set, lan 1 class code is set to 0x010000 (scsi). when cleared, lan 1 class code is set to 0x020000 (lan). see section 9.4.5, revision register (0x8; ro).
11 | lan 0 iscsi enable | 0b | 0b | when set, lan 0 class code is set to 0x010000 (scsi). when cleared, lan 0 class code is set to 0x020000 (lan). see section 9.4.5, revision register (0x8; ro).
10:8 | reserved | 0b | 0b | reserved - must be zero.
7:0 | devrevid | 0x1 | 0x1 | device revision id. for the 82576 a1, the default value is one. see section 9.4.5, revision register (0x8; ro).

6.2.21 led 0,2 configuration defaults (word 0x1f, 0x2b)
these eeprom words specify the hardware defaults for the ledctl register fields controlling the led0 (link_up) and led2 (link_100) output behaviors. word 0x1f controls lan0 leds behavior and word 0x2b controls lan1.

bit | name | hardware default | word 0x1b loaded from eeprom: 0x8403 | word 0x2b loaded from eeprom: 0x0602 | description
15 | led2 blink | 0b | 1b | 0b | initial value of led2_blink field. 0b = non-blinking. see section 8.2.1, device control register - ctrl (0x00000; r/w).
14 | led2 invert | 0b | 0b | 1b | initial value of led2_ivrt field. 0b = active-low output. see section 8.2.1, device control register - ctrl (0x00000; r/w).
13 | reserved | 0b | 0b | 0b | reserved - must be zero.
12 | reserved | 0b | 0b | 0b | reserved - must be zero.
11:8 | led2 mode | 0x6 | 0x4 | 0x6 | initial value of the led2_mode field specifying what event/state/pattern is displayed on led2 (link_100) output. a value of 0110b (0x6) indicates 100 mb/s operation. see section 8.2.1, device control register - ctrl (0x00000; r/w).
7 | led0 blink | 0b | 0b | 0b | initial value of led0_blink field. 0b = non-blinking. see section 8.2.1, device control register - ctrl (0x00000; r/w).
6 | led0 invert | 0b | 0b | 0b | initial value of led0_ivrt field. 0b = active-low output. see section 8.2.1, device control register - ctrl (0x00000; r/w).
5 | global blink mode | 0b | 0b | 0b | global blink mode. 0b = blink at 200 ms on and 200 ms off. 1b = blink at 83 ms on and 83 ms off. see section 8.2.1, device control register - ctrl (0x00000; r/w).
4 | reserved | 0b | 0b | 0b | reserved - must be zero.
3:0 | led0 mode | 0x2 | 0x3 | 0x2 | initial value of the led0_mode field specifying what event/state/pattern is displayed on led0 (link_up) output. a value of 0010b (0x2) indicates the link_up state. see section 8.2.1, device control register - ctrl (0x00000; r/w).

a value of 0x0602 is used to configure default hardware led behavior equivalent to previous copper adapters (led0=link_up, led1=blinking activity, led2=link_100, and led3=link_1000).
6.2.22 functions control (word 0x21)

bit | name | hardware default | loaded from eeprom: 0x2020 | description
15 | nc-si clock pad drive strength | 0b | 0b | defines the drive strength of the nc-si_clk_out pad. if set, the driving strength is doubled. see section 11.4.2.4, nc-si input and output pads for details. reflected in eediag (see section 8.4.5, eeprom diagnostic - eediag (0x01038; ro)).
14 | nc-si data pad drive strength | 0b | 0b | defines the drive strength of the nc-si_dv, nc-si_rxd[1:0] and nc-si_arb_out pads. if set, the driving strength is doubled. see section 11.4.2.4, nc-si input and output pads for details. reflected in eediag (see section 8.4.5, eeprom diagnostic - eediag (0x01038; ro)).
13 | nc-si output clock disable | 0b | 1b | if set, the clock source is external. in this case, the nc-si_clk_out pad is kept stable at zero, and the nc-si_clk_in pad is used as an input source of the clock. if cleared, the 82576 outputs the nc-si clock through the nc-si_clk_out pad. the nc-si_clk_in pad is still used as an nc-si clock input. if nc-si is not used, then this bit should be set. if this bit is cleared, the device power down enable bit in word 0x1e (bit 15) should not be set. reflected in eediag (see section 8.4.5, eeprom diagnostic - eediag (0x01038; ro)).
12 | lan function select | 0b | 0b | when both lan ports are enabled and lan function sel = 0b, lan 0 is routed to pci function 0 and lan 1 is routed to pci function 1. if lan function sel = 1b, lan 0 is routed to pci function 1 and lan 1 is routed to pci function 0. this bit is mapped to factps[30]. see section 8.6.4.
11:10 | bar mapping | 00b | 00b | 00b = 32 bit bars. 01b = reserved. 10b = 64 bit bars, no i/o bar. 11b = 64 bit bars, no flash bar. see section 9.4.11, base address registers (0x10:0x27; r/w).
9 | prefetchable | 0b | 0b | 0b = bars are marked as non prefetchable. 1b = bars are marked as prefetchable. see section 9.4.11, base address registers (0x10:0x27; r/w).
8:6 | reserved | 0b | 0b | reserved - must be zero.
5 | reserved | 1b | 1b | reserved - must be one.
4:0 | reserved | 0b | 0b | reserved - must be zero.
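The lan function select and bar mapping fields of the functions control word (word 0x21) can be read as in the sketch below. The function name and returned structure are illustrative assumptions; 0x2020 is the "loaded from eeprom" value from the table, and the routing shown applies to the case where both lan ports are enabled.

```python
BAR_MAPPING = {
    0b00: "32 bit bars",
    0b01: "reserved",
    0b10: "64 bit bars, no i/o bar",
    0b11: "64 bit bars, no flash bar",
}


def decode_functions_control(word):
    """Decode selected fields of word 0x21 (hypothetical helper)."""
    swap = (word >> 12) & 1
    # With lan function select = 1b the two ports swap pci functions.
    routing = {"lan0": 1, "lan1": 0} if swap else {"lan0": 0, "lan1": 1}
    return {
        "pci_function": routing,
        "bar_mapping": BAR_MAPPING[(word >> 10) & 0x3],
        "prefetchable": bool((word >> 9) & 1),
        "ncsi_output_clock_disable": bool((word >> 13) & 1),
    }


fc = decode_functions_control(0x2020)
print(fc)
```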
6.2.23 lan power consumption (word 0x22)

bit | name | hardware default | loaded from eeprom: 0x1ae5 | description
15:8 | lan d0 power | 0x0 | 0x1a | the value in this field is reflected in the pci power management data register of the lan functions for d0 power consumption and dissipation (data_select = 0 or 4). power is defined in 100 mw units. the power also includes the external logic required for the lan function. see section 9.5.1.4, power management control / status register - pmcsr (0x44; r/w).
7:5 | function 0 common power | 0x0 | 0x7 | the value in this field is reflected in the pci power management data register of function 0 when the data_select field is set to 8 (common function). the msbs in the data register that reflects the power values are padded with zeros. see section 9.5.1.4, power management control / status register - pmcsr (0x44; r/w).
4:0 | lan d3 power | 0x0 | 0x5 | the value in this field is reflected in the pci power management data register of the lan functions for d3 power consumption and dissipation (data_select = 3 or 7). power is defined in 100 mw units. the power also includes the external logic required for the lan function. the msbs in the data register that reflects the power values are padded with zeros. see section 9.5.1.4, power management control / status register - pmcsr (0x44; r/w).

6.2.24 i/o virtualization (iov) control (word 0x25)
this word controls iov functionality.

bit | name | hardware default | loaded from eeprom: 0x00f7 | description
15:8 | reserved | 0x0 | 0x0 | reserved - must be zero.
7:5 | max vfs | 0x7 | 0x7 | defines the value of maxvf exposed in the iov structure. valid values are 0-7. the value exposed is the value of this field + one.
4:3 | msi-x table | 0x2 | 0x2 | defines the size of the vf function msi-x table to request. valid values are 0-2.
2 | 64-bit advertisement | 1b | 1b | 0b = vf bars advertise 32-bit size. 1b = vf bars advertise 64-bit size.
1 | prefetchable | 0b | 1b | 0b = iov memory bars (0 and 3) are declared as non prefetchable. 1b = iov memory bars (0 and 3) are declared as prefetchable.
0 | iov enabled | 1b | 1b | 0b = iov and ari capability structures are not exposed as part of the capabilities link list. 1b = iov and ari capability structures are exposed as part of the capabilities link list.
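The iov control word (word 0x25) is small enough to decode in a few lines. This sketch uses the documented value 0x00F7; the function name and keys are illustrative assumptions. Note the plus-one rule for the max vfs field.

```python
def decode_iov_control(word):
    """Decode the iov control word (hypothetical helper)."""
    return {
        # The exposed maxvf is the field value plus one.
        "max_vfs_exposed": ((word >> 5) & 0x7) + 1,
        "msix_table_field": (word >> 3) & 0x3,
        "vf_bars_64bit": bool((word >> 2) & 1),
        "prefetchable": bool((word >> 1) & 1),
        "iov_enabled": bool(word & 1),
    }


iov = decode_iov_control(0x00F7)
print(iov)
```

With 0x00F7 the max vfs field is 7, so the exposed maxvf is 8, which agrees with the "8 vms" figure in the product features list.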
6.2.25 iov device id (word 0x26)
this word defines the device id for virtual functions.

bit | name | hardware default | loaded from eeprom: 0x10ca | description
15:0 | vdev id | 0x10ca | 0x10ca | virtual function device id.

6.2.26 end of read-only (ro) area (word 0x2c)
defines the end of the area in the eeprom that is ro.

bit | name | hardware default | loaded from eeprom: 0x0000 | description
15 | reserved | 0b | 0b | reserved - must be zero.
14:0 | eoro_area | 0x0 | 0x0 | defines the end of the area in the eeprom that is ro. the resolution is one word and can be up to byte address 0xffff (0x7fff words). a value of zero indicates no ro area.

6.2.27 start of ro area (word 0x2d)
defines the start of the area in the eeprom that is ro.

bit | name | hardware default | loaded from eeprom: 0x00000 | description
15 | reserved | 0b | 0b | reserved - must be zero.
14:0 | soro_area | 0x0 | 0x0 | defines the start of the area in the eeprom that is ro. the resolution is one word and can be up to byte address 0xffff (0x7fff words).

6.2.28 watchdog configuration (word 0x2e)

bit | name | hardware default | loaded from eeprom: 0x0000 | description
15 | watchdog enable | 0b | 0b | enable watchdog interrupt. see section 8.16.1, watchdog setup - wdstp (0x01040; r/w).
14:11 | watchdog timeout | 0x2 | 0x0 | watchdog timeout period (in seconds). see section 8.16.1, watchdog setup - wdstp (0x01040; r/w).
10:0 | reserved | 0x0 | 0x0 | reserved - must be zero.

6.2.29 vpd pointer (word 0x2f)
this word points to the vital product data (vpd) structure. this structure is available for the nic vendor to store its own data. a value of 0xffff indicates that the structure is not available.
bit | name | hardware default | loaded from eeprom: 0xffff | description
15:0 | vpd offset | 0xffff | 0xffff | offset to vpd structure in words. bits 15:9 must be set to 0 (the vpd area must be in the first 1 kbyte of eeprom).

6.2.30 nc-si arbitration enable (word 0x40)

bit | hardware default | loaded from eeprom: 0x0001 | description
15:2 | 0x0 | 0x0 | reserved - must be 0x0.
1 | 0b | 0b | reserved - must be 1b.
0 | 1b | 1b | 0 = ncsi_arb_in and ncsi_arb_out pads are not used. ncsi_arb_in is pulled up internally to provide stable input. 1 = ncsi_arb_in and ncsi_arb_out pads are used.

6.3 analog blocks configuration structures

6.3.1 analog configuration pointers start address (offset 0x17)

bit(s) | name | loaded from eeprom: 0x0060 | description
15:0 | address | 0x0060 | defines the word address in the eeprom of the pointers to the phy, pcie, and serdes initialization spaces.

note: word 0x17 points to the pointers of three configuration blocks: serdes, phy, and pcie.

6.3.2 pcie initialization pointer (offset 0, relative to word 0x17 value)

bit | name | description
15:0 | pcie config pointer | defines the location of the pcie initialization structure. from this location, the pcie lane all, pcie lanes 0/1/2/3, ccm and pll structures are linked.
6.3.3 phy initialization pointer (offset 1, relative to word 0x17 value)

bit | name | description
15:0 | phy config pointer | defines the location of the phy initialization structure. from this location, the phy structures are linked.

6.3.4 serdes initialization pointer (offset 2, relative to word 0x17 value)

bit | name | description
15:0 | serdes config pointer | defines the location of the serdes initialization structure. from this location, the serdes structures are linked.

6.4 serdes/phy/pcie/pll/ccm initialization structures

6.4.1 block header (offset 0x0)

bit | name | description
15:12 | destination type | destination type. defines the module type that this block configures: 0x0h = 802.3 phy. 0x1h = 802.3 serdes. 0x2h = ccm, gbe pll. 0x3h = pcie lane all; write to all four pcie lanes together. 0x4h = pcie pll. 0x5h = pcie lane 0. 0x6h = pcie lane 1. 0x7h = pcie lane 2. 0x8h = pcie lane 3.
6.4.1 block header (continued)

bit | name | description
11:10 | next block | next block. 00b = the next configuration block starts at the end of this one. 01b = this is the last configuration block. 10b = the next configuration block starts at an offset defined by the nbp (second, optional header word). 11b = reserved.
9:8 | core destination | indicates the port to be accessed. 00b = lan0. 01b = lan1. 10b = both cores. this block should be written for both lan 0 and lan 1. this field is relevant only if the destination is 802.3 phy or serdes blocks.
7:0 | word count | size of this structure.

6.4.2 crc8 (offset 1)

bit | name | description
15:8 | block crc | crc8.
7:0 | reserved | reserved - must be zero.

6.4.3 next buffer pointer (offset 2 - optional)

bit | name | description
15:0 | nbp | pointer to the starting word of the next configuration block.

6.4.4 address/data (offset 3:word count)

bit | name | description
15:8 | address | internal register address that is written to. refer to the following table.
7:0 | data | data to write.

id | structure type | register to use | address
0 | phy | mdic | 0x20
1 | serdes | serdesctl | 0x24
2 | ccm | ccmctl | 0x5b48
3 | all lanes | gioanactlall | 0x5b44
4 | pcie pll | scctl | 0x5b4c
id | structure type | register to use | address (continued)
5 | lane 0 | gioanactl0 | 0x5b34
6 | lane 1 | gioanactl1 | 0x5b38
7 | lane 2 | gioanactl2 | 0x5b3c
8 | lane 3 | gioanactl3 | 0x5b40

for the phy configuration structure, the description for the configuration words is as follows:

bit | name | description
15:0 | mdic value | even words: bits 15:0; odd words: bits 31:16.

6.5 firmware pointers & control words

words 0x51:0x52 are used to point to the loader and no manageability patches and the test structure. words 0x55:0x57 are used to point to firmware structures specific to pt. words 0x54 & 0x23 control some aspects of the fw functionality. a value of zero for a pointer indicates the relevant structure is not present in the eeprom.

6.5.1 loader patch pointer (word 0x51)

bit | name | description
15:0 | pointer | pointer to loader patch structure. see section 6.6, patch structure for details of the structure.

6.5.2 no manageability patch pointer (word 0x52)

bit | name | description
15:0 | pointer | pointer to no manageability patch structure. see section 6.6, patch structure for details of the structure.
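The block header of section 6.4.1 and the address/data words of section 6.4.4 pack several fields into one 16-bit word each. The Python sketch below shows one way to unpack them from the bit positions given in those tables. The function and dictionary names are hypothetical, and values the tables do not list are reported as "not listed" rather than guessed.

```python
DEST_TYPES = {
    0x0: "802.3 phy", 0x1: "802.3 serdes", 0x2: "ccm, gbe pll",
    0x3: "pcie lane all", 0x4: "pcie pll", 0x5: "pcie lane 0",
    0x6: "pcie lane 1", 0x7: "pcie lane 2", 0x8: "pcie lane 3",
}
NEXT_BLOCK = {
    0b00: "next block follows this one",
    0b01: "last block",
    0b10: "next block at nbp",
    0b11: "reserved",
}
CORE_DEST = {0b00: "lan0", 0b01: "lan1", 0b10: "both cores"}


def decode_block_header(word):
    """Split a block header word into the four fields of section 6.4.1."""
    return {
        "destination": DEST_TYPES.get((word >> 12) & 0xF, "not listed"),
        "next_block": NEXT_BLOCK[(word >> 10) & 0x3],
        "core": CORE_DEST.get((word >> 8) & 0x3, "not listed"),
        "word_count": word & 0xFF,
    }


def split_address_data(word):
    """Return (address, data) for an address/data word (section 6.4.4)."""
    return (word >> 8) & 0xFF, word & 0xFF


# Made-up header value: serdes block, last block, both cores, 5 words.
print(decode_block_header(0x1605))
```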
6.5.3 manageability capability/manageability enable (word 0x54)

bit | name | hardware default | loaded from eeprom: 0x0000 | description
15 | reserved | 0b | 0b | reserved - must be zero.
14 | redirection sideband interface | 0b | 0b | 0b = smbus. 1b = nc-si.
13:11 | reserved | 0x0 | 0x0 | reserved - must be zero.
10:8 | manageability mode | 0x0 | 0x0 | 0x0 = none. 0x1 = reserved. 0x2 = pass through (pt) mode. 0x3 = reserved. 0x4 = host interface enable only. 0x5:0x7 = reserved.
7 | port1 manageability capable | 0b | 0b | 0 = not capable. 1 = bit 3 is applicable to port 1.
6 | port0 manageability capable | 0b | 0b | 0 = not capable. 1 = bit 3 is applicable to port 0.
5:4 | reserved | 0b | 0b | reserved - must be zero.
3 | pass through capable | 0b | 0b | 0b = disable. 1b = enable.
2:0 | reserved | 0x0 | 0x0 | reserved - must be zero.

6.5.4 pt patch configuration pointer (word 0x55)

bit | name | description
15:0 | pointer | pointer to the pt patch configuration pointer structure. see section 6.6, patch structure for details of the structure.

6.5.5 pt lan0 configuration pointer (word 0x56)

bit | name | description
15:0 | pointer | pointer to the pt lan0 configuration pointer structure. see section 6.7, pt lan configuration structure for details of the structure.
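As an illustration of the word 0x54 layout in section 6.5.3, the following Python sketch extracts the non-reserved fields. The function name and the sample value are hypothetical; only the bit positions and encodings come from the table.

```python
MNG_MODES = {
    0x0: "none",
    0x2: "pass through (pt) mode",
    0x4: "host interface enable only",
}


def decode_word_0x54(word):
    """Decode the manageability capability/enable word (section 6.5.3)."""
    return {
        "sideband": "nc-si" if word & (1 << 14) else "smbus",
        "mode": MNG_MODES.get((word >> 8) & 0x7, "reserved"),
        "port1_capable": bool(word & (1 << 7)),
        "port0_capable": bool(word & (1 << 6)),
        "pass_through_capable": bool(word & (1 << 3)),
    }


# Made-up value: nc-si sideband, pt mode, both ports capable, pt enabled.
print(decode_word_0x54(0x42C8))
```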
6.5.6 sideband configuration pointer (word 0x57)

bit | name | description
15:0 | pointer | pointer to the sideband configuration pointer structure. see section 6.8, sideband configuration structure for details of the structure.

6.5.7 flex tco filter configuration pointer (word 0x58)

bit | name | description
15:0 | pointer | pointer to the flex tco configuration pointer structure. see section 6.9, flex tco filter configuration structure for details of the structure.

6.5.8 pt lan1 configuration pointer (word 0x59)

bit | name | description
15:0 | pointer | pointer to the pt lan1 configuration pointer structure. see section 6.7, pt lan configuration structure for details of the structure.

6.5.9 management hw config control (word 0x23)

this word contains bits that direct special firmware behavior when configuring the phy, pcie, and serdes interfaces.

bit | name | hardware default | loaded from eeprom: 0x0000 | description
15 | lan1_ftco_dis | 0b | 0b | lan1 force tco reset disable (1b disable; 0b enable).
14 | lan0_ftco_dis | 0b | 0b | lan0 force tco reset disable (1b disable; 0b enable).
13:10 | reserved | 0b | 0b | reserved - must be zero.
9 | fw code exist | 0b | 0b | if set, indicates to the firmware that there is firmware eeprom code at address 0x50.
8 | lan1_oem_dis | 0b | 0b | lan1 oem bits configuration disable.
7 | lan0_oem_dis | 0b | 0b | lan0 oem bits configuration disable.
6 | crc_dis | 0b | 0b | phy, serdes, and pcie crc disable.
5 | lan1_rom_dis | 0b | 0b | lan1 rom disable. disables phy and serdes rom configuration for port 1.
6.5.9 management hw config control (word 0x23) (continued)

bit | name | hardware default | loaded from eeprom: 0x0000 | description
4 | lan0_rom_dis | 0b | 0b | lan0 rom disable. disables phy and serdes rom configuration for port 0.
3 | mng_wake_check_dis | 0b | 0b | when set, indicates that the firmware should always configure the phy after power-up without checking if manageability or wake-up is enabled.
2 | pcie rom disable | 0b | 0b | when set, indicates to firmware not to configure the pcie from the rom tables.
1 | phy rom disable | 0b | 1b | when set, indicates to firmware not to configure the phy of both ports from the rom tables.
0 | serdes rom disable | 0b | 0b | when set, indicates to firmware not to configure the serdes of both ports from the rom tables.

6.6 patch structure

this structure is used for all the patches in different modes: loader, no manageability, and pass through.

6.6.1 patch data size (offset 0x0)

bit | name | description
15:0 | data size (bytes) |

6.6.2 block crc8 (offset 0x1)

bit | name | description
15:8 | reserved | reserved - must be zero
7:0 | crc8 |

6.6.3 patch entry point pointer low word (offset 0x2)

bit | name | description
15:0 | patch entry point pointer low word |

6.6.4 patch entry point pointer high word (offset 0x3)

bit | name | description
15:0 | patch entry point pointer high word |
6.6.5 patch version 1 word (offset 0x4)

bit | name | description
15:8 | patch generation hour |
7:0 | patch generation minutes |

6.6.6 patch version 2 word (offset 0x5)

bit | name | description
15:8 | patch generation month |
7:0 | patch generation day |

6.6.7 patch version 3 word (offset 0x6)

bit | name | description
15:8 | patch silicon version compatibility | 0x00 = a0. 0x01 = a1. 0x10 = b0. 0x11 = b1.
7:0 | patch generation year |

6.6.8 patch version 4 word (offset 0x7)

bit | name | description
15:8 | patch major number |
7:0 | patch minor number |

6.6.9 patch data words (offset 0x8, block length)

bit | name | description
15:0 | patch firmware data |

6.7 pt lan configuration structure

used to pre-configure manageability filters so that pass-thru traffic can be received without explicit configuration by the bmc.
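The eight header words of the patch structure (sections 6.6.1 through 6.6.8) can be unpacked as in the Python sketch below. Two points are assumptions rather than statements from the tables: the low and high entry point words are taken to form a single 32-bit value, and the date fields are returned as raw byte values because their encoding is not specified. The function name and sample words are hypothetical.

```python
def parse_patch_header(words):
    """Unpack patch structure offsets 0x0..0x7 from a list of 16-bit words."""
    return {
        "data_size_bytes": words[0],
        "crc8": words[1] & 0xFF,
        # Assumption: low word (offset 0x2) and high word (offset 0x3)
        # together form one 32-bit entry point value.
        "entry_point": (words[3] << 16) | words[2],
        "hour": words[4] >> 8, "minute": words[4] & 0xFF,
        "month": words[5] >> 8, "day": words[5] & 0xFF,
        "silicon": words[6] >> 8, "year": words[6] & 0xFF,
        "major": words[7] >> 8, "minor": words[7] & 0xFF,
    }


# Made-up header words for illustration only.
sample = [0x0100, 0x005A, 0x1234, 0x0001, 0x0C1E, 0x0B0F, 0x100B, 0x0102]
print(parse_patch_header(sample))
```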
6.7.1 section header (offset 0x0)

bit | name | description
15:8 | block crc8 |
7:0 | block length |

6.7.2 lan0 ipv4 address 0 lsb, mipaf0 (offset 0x01)

this value will be stored in the ipv4addr0 register (0x58e0).

bit | name | description
15:8 | lan0 ipv4 address 0 (byte 1) | manageability ip address filter (byte 1).
7:0 | lan0 ipv4 address 0 (byte 0) | manageability ip address filter (byte 0).

6.7.3 lan0 ipv4 address 0 msb, mipaf0 (offset 0x02)

this value will be stored in the ipv4addr0 register (0x58e0).

bit | name | description
15:8 | lan0 ipv4 address 0 (byte 3) | manageability ip address filter (byte 3).
7:0 | lan0 ipv4 address 0 (byte 2) | manageability ip address filter (byte 2).

6.7.4 lan0 ipv4 address 1; mipaf1 (offset 0x03:0x04)

same structure as lan0 ipv4 address 0. this value will be stored in the ipv4addr1 register (0x58e4).

6.7.5 lan0 ipv4 address 2; mipaf2 (offset 0x05:0x06)

same structure as lan0 ipv4 address 0. this value will be stored in the ipv4addr2 register (0x58e8).

6.7.6 lan0 ipv4 address 3; mipaf3 (offset 0x07:0x08)

same structure as lan0 ipv4 address 0. this value will be stored in the ipv4addr3 register (0x58ec).

6.7.7 lan0 mac address 0 lsb, mmal0 (offset 0x09)

this value will be stored in the mmal0 register (0x5910).
(table for offset 0x09, continued from the previous page)
bit | name | description
15:8 | lan0 mac address 0 (byte 1) | manageability mac address low (byte 1).
7:0 | lan0 mac address 0 (byte 0) | manageability mac address low (byte 0).

6.7.8 lan0 mac address 0 lsb, mmal0 (offset 0x0a)

this value will be stored in the mmal0 register (0x5910).

bit | name | description
15:8 | lan0 mac address 0 (byte 3) | manageability mac address low (byte 3).
7:0 | lan0 mac address 0 (byte 2) | manageability mac address low (byte 2).

6.7.9 lan0 mac address 0 msb, mmah0 (offset 0x0b)

this value will be stored in the mmah0 register (0x5914).

bit | name | description
15:8 | lan0 mac address 0 (byte 5) | manageability mac address high (byte 5)
7:0 | lan0 mac address 0 (byte 4) | manageability mac address high (byte 4)

6.7.10 lan0 mac address 1; mmal/h1 (offset 0x0c:0x0e)

same structure as lan0 mac address 0. this value will be stored in the mmal1/mmah1 registers (0x5918/1c).

6.7.11 lan0 mac address 2; mmal/h2 (offset 0x0f:0x11)

same structure as lan0 mac address 0. this value will be stored in the mmal2/mmah2 registers (0x5920/24).

6.7.12 lan0 mac address 3; mmal/h3 (offset 0x12:0x14)

same structure as lan0 mac address 0. this value will be stored in the mmal3/mmah3 registers (0x5928/2c).

6.7.13 lan0 udp flex filter ports 0:15; mfutp registers (offset 0x15:0x24)

this value will be stored in the mfutp register (0x5030 - bits 15:0).
(table for offsets 0x15:0x24, continued from the previous page)
bit | name | description
15:0 | lan udp flex filter value | management flex udp/tcp port

6.7.14 lan0 vlan filter 0:7; mavtv registers (offset 0x25:0x2c)

this value will be stored in the mavtv[7:0] registers (0x5010 - 0x502c).

bit | name | description
15:12 | reserved | reserved - must be zero
11:0 | lan0 vlan filter value | vlan id value

6.7.15 lan0 manageability filters valid; mfval lsb (offset 0x2d)

this value will be stored in the mfval register (0x5824).

bit | name | description
15:8 | vlan | indicates whether or not the vlan filter registers (mavtv) contain valid vlan tags. bit 8 corresponds to filter 0, etc.
7:4 | reserved | reserved - must be zero.
3:0 | mac | indicates whether or not the mac unicast filter registers (mmah and mmal) contain valid mac addresses. bit 0 corresponds to filter 0, etc.

6.7.16 lan0 manageability filters valid; mfval msb (offset 0x2e)

this value will be stored in the mfval register (0x5824).

bit | name | description
15:12 | reserved | reserved - must be zero.
11:8 | ipv6 | indicates whether or not the ipv6 address filter registers (mipaf) contain valid ipv6 addresses. bit 8 corresponds to address 0, etc. bit 11 (filter 3) applies only when ipv4 address filters are not enabled (manc.en_ipv4_filter=0b).
7:4 | reserved | reserved - must be zero.
3:0 | ipv4 | indicates whether or not the ipv4 address filters (mipaf) contain a valid ipv4 address. these bits apply only when ipv4 address filters are enabled (manc.en_ipv4_filter=1b)

6.7.17 lan0 manc value lsb (offset 0x2f)

this value will be stored in the manc register (0x5820).

bit | name | description
15:0 | reserved | reserved - must be zero.
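The two mfval words of sections 6.7.15 and 6.7.16 are bit masks with one bit per filter. The Python sketch below lists which filters each word marks as valid. It assumes the lsb word supplies register bits 15:0 and the msb word bits 31:16, which the text implies but does not state outright. The function name and sample values are hypothetical.

```python
def decode_mfval(lsb, msb):
    """List the filter indexes marked valid by the two mfval eeprom words."""
    def set_bits(value, shift, count):
        # Index i is reported when bit (shift + i) of the word is set.
        return [i for i in range(count) if (value >> (shift + i)) & 1]

    return {
        "mac": set_bits(lsb, 0, 4),    # lsb bits 3:0, bit 0 = filter 0
        "vlan": set_bits(lsb, 8, 8),   # lsb bits 15:8, bit 8 = filter 0
        "ipv4": set_bits(msb, 0, 4),   # msb bits 3:0
        "ipv6": set_bits(msb, 8, 4),   # msb bits 11:8, bit 8 = address 0
        # Assumption: lsb word -> register bits 15:0, msb word -> bits 31:16.
        "register": (msb << 16) | lsb,
    }


# Made-up words for illustration.
print(decode_mfval(0x0301, 0x0105))
```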
6.7.18 lan0 manc value msb (offset 0x30)

this value will be stored in the manc register (0x5820).

bit | name | description
15:12 | reserved | reserved - must be zero.
11 | macsec mode | when set, only packets that match one of the following three conditions are forwarded to manageability: (1) the packet is a macsec packet authenticated and/or decrypted adequately by the hw; (2) the packet ethertype matches metf[2]; (3) the packet ethertype matches metf[3].
10 | net_type | net type: 0b = pass only untagged packets. 1b = pass only vlan tagged packets. valid only if fixed_net_type is set.
9 | fixed_net_type | fixed net type: if set, only packets matching the net type defined by the net_type field pass to manageability. otherwise, both tagged and untagged packets can be forwarded to the manageability engine.
8 | enable ipv4 address filters | when set, the last 128 bits of the mipaf register are used to store four ipv4 addresses for ipv4 filtering. when cleared, these bits store a single ipv6 filter.
7 | enable xsum filtering to mng | when this bit is set, only packets that pass the l3 and l4 checksum are sent to the manageability block.
6 | bypass vlan | when set, vlan filtering is bypassed for mng packets.
5 | enable mng packets to host memory | this bit enables the functionality of the manc2h register. when set, the packets that are specified in the manc2h registers are also sent to host memory if they pass the manageability filters.
4:0 | reserved | reserved - must be zero.

6.7.19 lan0 receive enable 1 (offset 0x31)

bit | name | description
15:8 | receive enable byte 12 | bmc smbus slave address.
7 | enable mc dedicated mac |
6 | reserved | always set to 1b.
5:4 | notification method | 00b = smbus alert. 01b = asynchronous notify. 10b = direct receive. 11b = reserved.
3 | enable arp response |
6.7.19 lan0 receive enable 1 (continued)

bit | name | description
2 | enable status reporting |
1 | enable receive all |
0 | enable receive tco |

6.7.20 lan0 receive enable 2 (offset 0x32)

bit | name | description
15:8 | receive enable byte 14 | alert value.
7:0 | receive enable byte 13 | interface value.

6.7.21 lan0 manc2h value lsb (offset 0x33)

this value will be stored in the manc2h register (0x5860).

bit | name | description
15:8 | reserved | must be 0.
7:0 | host enable | when set, indicates that packets routed by the manageability filters to manageability are also sent to the host. bit 0 corresponds to decision rule 0, etc.

6.7.22 lan0 manc2h value msb (offset 0x34)

this value will be stored in the manc2h register (0x5860).

bit | name | description
15:0 | reserved | reserved - must be zero.

6.7.23 manageability decision filters; mdef0,1 (offset 0x35)

this value will be stored in the mdef0 register (0x5890).

bit | name | description
15:12 | flex port | controls the inclusion of flex port filtering in the manageability filter decision (or section). bit 12 corresponds to flex port 0, etc. (see also bits 11:0 of the next word).
11 | port 0x26f | controls the inclusion of port 0x26f filtering in the manageability filter decision (or section).
10 | port 0x298 | controls the inclusion of port 0x298 filtering in the manageability filter decision (or section).
9 | neighbor discovery | controls the inclusion of neighbor discovery filtering in the manageability filter decision (or section).
6.7.23 manageability decision filters; mdef0,1 (continued)

bit | name | description
8 | arp response | controls the inclusion of arp response filtering in the manageability filter decision (or section).
7 | arp request | controls the inclusion of arp request filtering in the manageability filter decision (or section).
6 | multicast | controls the inclusion of multicast addresses filtering in the manageability filter decision (and section).
5 | broadcast | controls the inclusion of broadcast address filtering in the manageability filter decision (or section).
4 | unicast | controls the inclusion of unicast address filtering in the manageability filter decision (or section).
3 | ip address | controls the inclusion of ip address filtering in the manageability filter decision (and section).
2 | vlan | controls the inclusion of vlan addresses filtering in the manageability filter decision (and section).
1 | broadcast | controls the inclusion of broadcast address filtering in the manageability filter decision (and section).
0 | unicast | controls the inclusion of unicast address filtering in the manageability filter decision (and section).

6.7.24 manageability decision filters; mdef0,2 (offset 0x36)

this value will be stored in the mdef0 register (0x5890). reserved - must be zero

bit | name | description
15:12 | flex tco | controls the inclusion of flex tco filtering in the manageability filter decision (or section). bit 12 corresponds to flex tco filter 0, etc.
11:0 | flex port | controls the inclusion of flex port filtering in the manageability filter decision (or section). bit 11 corresponds to flex port 0, etc. (see also bits 15:12 of the previous word).

6.7.25 manageability decision filters; mdef0,3 (offset 0x37)

this value will be stored in the mdef_ext0 register (0x5930).

bit | name | description
15:12 | reserved | reserved - must be zero.
11:8 | l2 ethertype or | l2 ethertype - controls the inclusion of l2 ethertype filtering in the manageability filter decision (or section).
7:4 | reserved | reserved - must be zero.
3:0 | l2 ethertype and | l2 ethertype - controls the inclusion of l2 ethertype filtering in the manageability filter decision (and section).

6.7.26 manageability decision filters; mdef0,4 (offset 0x38)

this value will be stored in the mdef_ext0 register (0x5930).
(table continued from the previous page)
bit | name | description
15:0 | reserved | reserved - must be zero.

6.7.27 manageability decision filters; mdef1:6, 1:4 (offset 0x39:0x50)

same as words 0x35 to 0x38 for mdef1:mdef6. these values are stored in the mdef[6:1] registers (0x5894 - 0x58ac) and mdef_ext[6:1] registers (0x5934 - 0x594c).

6.7.28 ethertype data (word 0x

6.7.29 ethertype filter; metf0, 1 (offset 0x51)

this value is stored in the metf0 register (0x5060).

bit | name | description
15:0 | metf | ethertype value to be compared against the l2 ethertype field in the rx packet.

6.7.30 ethertype filter; metf0, 1 (offset 0x52)

this value is stored in the metf0 register (0x5060).

bit | name | description
15 | reserved | reserved - must be zero
14 | polarity | 0 = positive filter - forward packets matching this filter to the manageability block. 1 = negative filter - block packets matching this filter from the manageability block.
13:0 | reserved | reserved - must be zero

6.7.31 ethertype filter; metf1:3,1:2 (offset 0x53:0x58)

same as words 0x51 and 0x52 for metf1:metf3. these values are stored in the metf[3:1] registers (0x5064 - 0x506c).
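Each ethertype filter is described by a pair of words (0x51 and 0x52 for metf0): the first carries the ethertype, the second carries the polarity bit. The Python sketch below combines one pair. It assumes the first word supplies register bits 15:0 and the second bits 31:16; the function name is hypothetical, and 0x88e5 (the macsec ethertype) is used only as a sample value.

```python
POLARITY_BIT = 1 << 14  # bit 14 of the second word of the pair


def metf_from_words(word1, word2):
    """Combine the two eeprom words of one ethertype filter."""
    return {
        "ethertype": word1,
        # 0 = positive filter (forward matches), 1 = negative filter (block).
        "negative": bool(word2 & POLARITY_BIT),
        # Assumption: word1 -> register bits 15:0, word2 -> bits 31:16.
        "register": (word2 << 16) | word1,
    }


print(metf_from_words(0x88E5, 0x0000))  # positive filter on 0x88e5
print(metf_from_words(0x88E5, 0x4000))  # negative filter on 0x88e5
```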
6.7.32 arp response ipv4 address 0 lsb (offset 0x59)

bit | name | description
15:8 | arp response ipv4 address byte 1 |
7:0 | arp response ipv4 address byte 0 |

6.7.33 arp response ipv4 address 0 msb (offset 0x5a)

bit | name | description
15:8 | arp response ipv4 address byte 3 |
7:0 | arp response ipv4 address byte 2 |

6.7.34 lan0 ipv6 address 0 lsb; mipaf (offset 0x5b)

this value will be stored in the mipaf0 register (0x58b0).

bit | name | description
15:8 | lan0 ipv6 address 0 byte 1 |
7:0 | lan0 ipv6 address 0 byte 0 |

6.7.35 lan0 ipv6 address 0 msb; mipaf (offset 0x5c)

this value will be stored in the mipaf0 register (0x58b0).

bit | name | description
15:8 | lan0 ipv6 address 0 byte 3 |
7:0 | lan0 ipv6 address 0 byte 2 |

6.7.36 lan0 ipv6 address 0 lsb; mipaf (offset 0x5d)

this value will be stored in the mipaf1 register (0x58b4).

bit | name | description
15:8 | lan0 ipv6 address 0 byte 5 |
7:0 | lan0 ipv6 address 0 byte 4 |

6.7.37 lan0 ipv6 address 0 msb; mipaf (offset 0x5e)

this value will be stored in the mipaf1 register (0x58b4).
(table for offset 0x5e, continued from the previous page)
bit | name | description
15:8 | lan0 ipv6 address 0 byte 7 |
7:0 | lan0 ipv6 address 0 byte 6 |

6.7.38 lan0 ipv6 address 0 lsb; mipaf (offset 0x5f)

this value will be stored in the mipaf2 register (0x58b8).

bit | name | description
15:8 | lan0 ipv6 address 0 byte 9 |
7:0 | lan0 ipv6 address 0 byte 8 |

6.7.39 lan0 ipv6 address 0 msb; mipaf (offset 0x60)

this value will be stored in the mipaf2 register (0x58b8).

bit | name | description
15:8 | lan0 ipv6 address 0 byte 11 |
7:0 | lan0 ipv6 address 0 byte 10 |

6.7.40 lan0 ipv6 address 0 lsb; mipaf (offset 0x61)

this value will be stored in the mipaf3 register (0x58bc).

bit | name | description
15:8 | lan0 ipv6 address 0 byte 13 |
7:0 | lan0 ipv6 address 0 byte 12 |

6.7.41 lan0 ipv6 address 0 msb; mipaf (offset 0x62)

this value will be stored in the mipaf3 register (0x58bc).

bit | name | description
15:8 | lan0 ipv6 address 0 byte 15 |
7:0 | lan0 ipv6 address 0 byte 14 |

6.7.42 lan0 ipv6 address 1; mipaf (offset 0x63:0x6a)

same structure as lan0 ipv6 address 0. these values are stored in the mipaf[7:4] registers (0x58c0 - 0x58cc).
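The eight words at offsets 0x5b through 0x62 each hold two bytes of ipv6 address 0, with the even-numbered byte in bits 7:0 and the odd-numbered byte in bits 15:8. The Python sketch below reassembles them into an address. It assumes that byte 0 is the first byte of the address in network order, which the tables do not state; the function name is hypothetical.

```python
import ipaddress


def ipv6_from_words(words):
    """Rebuild an ipv6 address from the eight eeprom words of one filter.

    Word n holds byte 2n in bits 7:0 and byte 2n+1 in bits 15:8.
    Assumption: byte 0 is the first (leftmost) byte of the address.
    """
    raw = bytearray()
    for word in words:
        raw.append(word & 0xFF)         # even-numbered byte
        raw.append((word >> 8) & 0xFF)  # odd-numbered byte
    return ipaddress.IPv6Address(bytes(raw))


# Words that encode the documentation address 2001:db8::1 under that assumption.
example = [0x0120, 0xB80D, 0, 0, 0, 0, 0, 0x0100]
print(ipv6_from_words(example))
```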
6.7.43 lan0 ipv6 address 2; mipaf (offset 0x6b:0x72)

same structure as lan0 ipv6 address 0. these values are stored in the mipaf[11:8] registers (0x58d0 - 0x58dc).

6.8 sideband configuration structure

this section defines parameters of the smbus and nc-si interfaces.

6.8.1 section header (offset 0x0)

bit | name | description
15:8 | block crc8 |
7:0 | block length |

6.8.2 smbus max fragment size (offset 0x1)

bit | name | description
15:0 | smbus max fragment size (bytes) | between 32 and 240 bytes.

6.8.3 smbus notification timeout and flags (offset 0x2)

bit | name | description
15:8 | smbus notification timeout (ms) | timeout until the discarding of a packet not read by the external mc completes. 0 = no discard.
7:6 | smbus connection speed | 00b = slow smbus connection. 01b = fast smbus connection (1 mhz). 10b = reserved. 11b = reserved.
5 | smbus block read command | 0b = block read command is c0. 1b = block read command is d0.
4 | smbus addressing mode | 0b = single address mode. 1b = dual address mode.
3 | reserved | reserved - must be zero
6.8.3 smbus notification timeout and flags (continued)

bit | name | description
2 | disable smbus arp functionality |
1 | smbus arp pec |
0 | reserved | reserved - must be zero

6.8.4 smbus slave address (offset 0x3)

bit | name | description
15:9 | smbus 1 slave address | dual-address mode only.
8 | reserved | reserved - must be zero.
7:1 | smbus 0 slave address |
0 | reserved | reserved - must be zero.

6.8.5 smbus fail-over register; low word (offset 0x4)

bit | name | description
15:12 | gratuitous arp counter |
11:10 | reserved | reserved - must be zero.
9 | enable teaming fail-over on dx |
8 | remove promiscuous on dx |
7 | enable mac filtering |
6 | enable repeated gratuitous arp |
5 | reserved | reserved - must be zero.
4 | enable preferred primary |
3 | preferred primary port |
2 | transmit pair |
1:0 | reserved | reserved - set to 1.

6.8.6 smbus fail-over register; high word (offset 0x5)

bit | name | description
15:8 | gratuitous arp transmission interval (seconds) |
7:0 | link down fail-over time |
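The smbus slave address word of section 6.8.4 carries two 7-bit addresses, each followed by a reserved bit. The Python sketch below pulls them out. The function name and sample word are hypothetical, and the addresses are returned as 7-bit values without the reserved low bit.

```python
def decode_smbus_slave_word(word, dual_address_mode):
    """Return (smbus 0 address, smbus 1 address) as 7-bit values.

    The second address is only meaningful in dual-address mode
    (bit 4 of the word at offset 0x2), so it is None otherwise.
    """
    addr0 = (word >> 1) & 0x7F                                # bits 7:1
    addr1 = (word >> 9) & 0x7F if dual_address_mode else None  # bits 15:9
    return addr0, addr1


# Made-up word: 0xC4 in the low byte and 0xC6 in the high byte.
print(decode_smbus_slave_word(0xC6C4, dual_address_mode=True))
```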
6.8.7 nc-si configuration (offset 0x6)

bit | name | description
15:11 | reserved | reserved - must be zero.
10 | reserved | reserved - must be zero.
9 | nc-si hw arbitration supported | 0b = not supported. 1b = supported.
8 | nc-si hw-based packet copy enable | 0b = disable. 1b = enable.
7:5 | package id |
4:0 | reserved | reserved - must be zero.

6.8.8 nc-si hardware arbitration configuration (offset 0x8)

bit | name | description
15:0 | token timeout | nc-si hw-arbitration token timeout (in 16 ns cycles). to get the value in nc-si ref_clk cycles, this field should be multiplied by 4/5. setting the value to 0 disables the timeout mechanism.

6.8.9 reserved (offset 0x9 - 0xc)

reserved. must be zero

6.9 flex tco filter configuration structure

used to pre-configure the manageability-tco flex filters so that pass-thru traffic can be received without explicit configuration by the bmc. this should be used in conjunction with the pt-lan configuration structure.

6.9.1 section header (offset 0x0)

bit | name | description
15:8 | block crc8 |
7:0 | block length |
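The token timeout field of section 6.8.8 counts 16 ns cycles, and the 4/5 factor converts that count into nc-si ref_clk cycles. The Python sketch below performs both conversions and treats 0 as "timeout disabled". The function name is hypothetical, and exact fractions are used so that no rounding policy is implied.

```python
from fractions import Fraction


def token_timeout(field):
    """Convert the 16-bit token timeout field into time and clock units."""
    if field == 0:
        return None  # a value of 0 disables the timeout mechanism
    return {
        "ns": field * 16,                          # field is in 16 ns cycles
        "ref_clk_cycles": Fraction(field * 4, 5),  # multiply by 4/5
    }


print(token_timeout(1000))
```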
6.9.2 flex filter length and control (offset 0x01)

bit | name | description
15:8 | flex filter length (bytes) |
7:5 | reserved | reserved - must be zero.
4 | last filter |
3:2 | filter index (3:0) |
1 | apply filter to lan 1 |
0 | apply filter to lan 0 |

6.9.3 flex filter enable mask (offset 0x02:0x09)

bit | name | description
15:0 | flex filter enable mask |

6.9.4 flex filter data (offset 0x0a - block length)

bit | name | description
15:0 | flex filter data |

6.10 software accessed words

words 0x03 to 0x07 in the eeprom image are used for compatibility information. new bits within these fields will be defined as the need arises for determining software compatibility between various hardware revisions. words 0x08 and 0x09 are used to indicate the printed board assembly (pba) number and words 0x42 and 0x43 identify the eeprom image. words 0x30 to 0x3e have been used for configuration and version values by pxe code. the only exceptions are word 0x3d, which is used for the iscsi boot configuration, and word 0x37, which is used for the alternate mac address pointer.
6.10.1 compatibility (word 0x03)

bit | loaded from eeprom: 0x0410 | description
15 | 0 | reserved (set to 0b).
14 | 0 | serdes forced mode enable: 0 = normal operation; intel driver will enable pcs_lctl.an_enable. 1 = forced mode enable; intel driver will not set pcs_lctl.an_enable.
13 | 0 | reserved (set to 0b).
12 | 0 | asf smbus connected. 0b = not connected. 1b = connected.
11 | 0 | lom/not a lom. 0b = nic. 1b = lom.
10 | 1 | server/not a server nic. 0b = client. 1b = server.
9 | 0 | client/not a client nic. 0b = server. 1b = client.
8 | 0 | retail/oem. 0b = retail. 1b = oem.
7:6 | 00 | reserved (set to 00b).
5 | 0 | reserved (set to 1b).
4 | 1 | smbus connected. 0b = not connected. 1b = connected.
3 | 0 | reserved (set to 0b).
2 | 0 | pci bridge/no pci bridge. 0b = pci bridge not present. 1b = pci bridge present.
1:0 | 00 | reserved (set to 00b)

6.10.2 oem specific (word 0x04)

driver software provides a method to identify an external port on a system through a command that causes the leds to blink. based on the setting in word 0x4, the led drivers should blink between state1 and state2 when a port identification command is issued.
when word 0x4 is equal to 0xffff or 0x0000, the blinking behavior reverts to a default.

bit | loaded from eeprom: 0xffff | description
15:12 | 0xf | control for led 3. 0000b or 1111b = default led blinking operation is used. 0001b = default in state1 + default in state2. 0010b = default in state1 + led is on in state2. 0011b = default in state1 + led is off in state2. 0100b = led is on in state1 + default in state2. 0101b = led is on in state1 + led is on in state2. 0110b = led is on in state1 + led is off in state2. 0111b = led is off in state1 + default in state2. 1000b = led is off in state1 + led is on in state2. 1001b = led is off in state1 + led is off in state2. all other values are reserved.
11:8 | 0xf | control for led 2 - same encoding as for led 3.
7:4 | 0xf | control for led 1 - same encoding as for led 3.
3:0 | 0xf | control for led 0 - same encoding as for led 3.

6.10.3 oem specific (word 0x06, 0x07)

these words are available for oem use. loaded from sample eeprom: 0xffff 0xffff.

6.10.4 eeprom image revision (word 0x05)

this word is valid only for device starter images and indicates the id and version of the eeprom image.

bit | loaded from eeprom: 0x2011 | description
15:12 | 0x2 | eeprom major version.
11:4 | 0x01 | eeprom minor version.
3:0 | 0x1 | eeprom image id.

6.10.5 pba number module (word 0x08, 0x09)

loaded from sample eeprom: 0xffff 0xffff. the nine-digit printed board assembly (pba) number used for intel manufactured network interface cards (nics) is stored in eeprom.
through the course of hardware ecos, the suffix field is incremented. the purpose of this information is to enable customer support (or any user) to identify the revision level of a product. network driver software should not rely on this field to identify the product or its capabilities.

pba numbers have exceeded the length that can be stored as hex values in two words. for newer nics, the high word in the pba number module is a flag (0xfafa) indicating that the actual pba is stored in a separate pba block. the low word is a pointer to the starting word of the pba block. the following shows the format of the pba number module field for new products.

pba number | word 0x8 | word 0x9
g23456-003 | fafa | pointer to pba block

the following provides the format of the pba block, pointed to by word 0x9 above:

word offset | description
0x0 | length in words of the pba block (default is 0x6)
0x1 ... 0x5 | pba number stored in hexadecimal ascii values.

the new pba block contains the complete pba number and includes the dash and the first digit of the 3-digit suffix which were not included previously. each digit is represented by its hexadecimal-ascii value. the following shows an example pba number (in the new style):

pba number | word offset 0 | word offset 1 | word offset 2 | word offset 3 | word offset 4 | word offset 5
g23456-003 | 0006 | 4732 | 3334 | 3536 | 2d30 | 3033
(meaning) | specifies 6 words | g2 | 34 | 56 | -0 | 03

older nics have pba numbers starting with [a,b,c,d,e] and are stored directly in words 0x8-0x9. the dash in the pba number is not stored; nor is the first digit of the 3-digit suffix (the first digit is always 0 for older products). the following example shows a pba number stored in the pba number module field (in the old style):

pba number | byte 1 | byte 2 | byte 3 | byte 4
e23456-003 | e2 | 34 | 56 | 03

6.10.6 pxe configuration words (word 0x30:3b)

pxe configuration is controlled by the following words.
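The two pba storage styles of section 6.10.5 can be told apart by the 0xfafa flag in word 0x8. The Python sketch below handles both, using the g23456-003 and e23456-003 examples from the tables. It assumes the eeprom image is a list of 16-bit words, that each pba block word holds its first character in the high byte (as the 4732 = "g2" example suggests), and that in the old style byte 1 is the high byte of word 0x8. The function name and the block address 0x20 are hypothetical.

```python
PBA_FLAG = 0xFAFA  # word 0x8 value that marks the new-style pba block


def read_pba(eeprom):
    """Return the pba number string from an eeprom word list."""
    word8, word9 = eeprom[0x08], eeprom[0x09]
    if word8 == PBA_FLAG:
        # New style: word 0x9 points at the block; offset 0 is the block
        # length in words (including itself), the rest is ascii text.
        length = eeprom[word9]
        data = eeprom[word9 + 1 : word9 + length]
        return b"".join(w.to_bytes(2, "big") for w in data).decode("ascii")
    # Old style: four bytes stored directly; the dash and the leading
    # suffix digit (always 0) are not stored and are put back here.
    digits = f"{word8:04X}{word9:04X}"
    return f"{digits[:6]}-0{digits[6:]}"


new_image = [0xFFFF] * 0x40
new_image[0x08], new_image[0x09] = 0xFAFA, 0x20
new_image[0x20:0x26] = [0x0006, 0x4732, 0x3334, 0x3536, 0x2D30, 0x3033]
print(read_pba(new_image))  # G23456-003

old_image = [0xFFFF] * 0x40
old_image[0x08], old_image[0x09] = 0xE234, 0x5603
print(read_pba(old_image))  # E23456-003
```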
6.10.6.1 main setup options pci function 0 (word 0x30)

the main setup options are stored in word 30h. these options are those that can be changed by the user via the control-s setup menu. word 30h has the following format:

bit(s) | name | hardware default | loaded from eeprom: 0x0100 | description
15:13 | rfu | 0x0 | 0x0 | reserved. must be 0.
12:10 | fsd | 0x0 | 0x0 | bits 12-10 control forcing speed and duplex during driver operation. valid values are: 000b = auto-negotiate. 001b = 10mbps half duplex. 010b = 100mbps half duplex. 011b = not valid (treated as 000b). 100b = 10mbps full duplex. 101b = 100mbps full duplex. 111b = 1000mbps full duplex. only applicable for copper-based adapters. not applicable to 10gbe. default value is 000b.
9 | rsv | 0b | 0b | reserved. set to 0.
8 | dsm | 1b | 1b | display setup message. if the bit is set to 1, the press control-s message is displayed after the title message. default value is 1.
7:6 | pt | 0x0 | 0x0 | prompt time. these bits control how long the ctrl-s setup prompt message is displayed, if enabled by dsm. 00 = 2 seconds (default). 01 = 3 seconds. 10 = 5 seconds. 11 = 0 seconds. note: ctrl-s message is not displayed if 0 seconds prompt time is selected.
5 | rsv | 0b | 0b | reserved.
4:3 | db | 0b | 0b | default boot selection. these bits select which device is the default boot device. these bits are only used if the agent detects that the bios does not support boot order selection or if the mode field of word 31h is set to mode_legacy. 00 = network boot, then local boot (default). 01 = local boot, then network boot. 10 = network boot only. 11 = local boot only.
Word 30h format (continued), bits 2:0:

  Set Value         Port Status      CLP (Combo) Executes  iSCSI Boot Option ROM Ctrl-D Menu / FCoE Boot Option ROM Ctrl-D Menu
  101b, 110b, 111b  Reserved. Same as disabled.
  100b              FCoE             FCoE                  iSCSI menu: displays port as FCoE; allows changing the port to boot disabled, iSCSI primary, or iSCSI secondary.
                                                           FCoE menu: displays port as FCoE; allows changing to boot disabled.
  011b              iSCSI secondary  iSCSI                 iSCSI menu: displays port as iSCSI secondary; allows changing to boot disabled or iSCSI primary.
                                                           FCoE menu: displays port as iSCSI; allows changing to boot disabled or FCoE enabled.
  010b              iSCSI primary    iSCSI                 iSCSI menu: displays port as iSCSI primary; allows changing to boot disabled or iSCSI secondary.
                                                           FCoE menu: displays port as iSCSI; allows changing to boot disabled or FCoE enabled.
  001b              Boot disabled    None                  iSCSI menu: displays port as disabled; allows changing to iSCSI primary/secondary.
                                                           FCoE menu: displays port as disabled; allows changing to FCoE enabled.
  000b              PXE              PXE                   iSCSI menu: displays port as PXE; allows changing to boot disabled, iSCSI primary, or iSCSI secondary.
                                                           FCoE menu: displays port as PXE; allows changing to boot disabled or FCoE enabled.

6.10.6.2 Configuration Customization Options PCI Function 0 (Word 0x31)

Word 31h of the EEPROM contains settings that can be programmed by an OEM or network administrator to customize the operation of the software. These settings cannot be changed from within the Control-S setup menu. The lower byte contains settings that would typically be configured by a network administrator using an external utility; these settings generally control which setup menu options are changeable. The upper byte generally contains settings that would be used by an OEM to control the operation of the agent in a LOM environment, although there is nothing in the agent to prevent their use on a NIC implementation. The default value for this word is 4000h.

  Bit(s)  Name  Hardware Default  Loaded from EEPROM: 0x4000  Function
  15:14   SIG   0x1               0x1                         Signature. Must be set to 01 to indicate that this word has been programmed by the agent or other configuration software.
  13      RFU   0b                0b                          Reserved. Must be 0.
  12      RFU   0b                0b                          Reserved. Must be 0.
  11      RETRY  0b               0b                          Selects continuous retry operation. If this bit is set, IBA will not transfer control back to the BIOS if it fails to boot due to a network error (such as failure to receive DHCP replies). Instead, it will restart the PXE boot process again. If this bit is set, the only way to cancel PXE boot is for the user to press ESC on the keyboard. Retry will not be attempted due to hardware conditions such as an invalid EEPROM checksum or failing to establish link. Default value is 0.
  10:8    MODE   0b               0b                          Selects the agent's boot order setup mode. This field changes the agent's default behavior in order to make it compatible with systems that do not completely support the BBS and PnP expansion ROM standards. Valid values and their meanings are:
                                                                000b - Normal behavior. The agent will attempt to detect BBS and PnP expansion ROM support as it normally does.
                                                                001b - Force legacy mode. The agent will not attempt to detect BBS or PnP expansion ROM support in the BIOS and will assume the BIOS is not compliant. The user can change the BIOS boot order in the setup menu.
                                                                010b - Force BBS mode. The agent will assume the BIOS is BBS-compliant, even though it may not be detected as such by the agent's detection code. The user cannot change the BIOS boot order in the setup menu.
                                                                011b - Force PnP Int18 mode. The agent will assume the BIOS allows boot order setup for PnP expansion ROMs and will hook interrupt 18h (to inform the BIOS that the agent is a bootable device) in addition to registering as a BBS IPL device. The user cannot change the BIOS boot order in the setup menu.
                                                                100b - Force PnP Int19 mode. The agent will assume the BIOS allows boot order setup for PnP expansion ROMs and will hook interrupt 19h (to inform the BIOS that the agent is a bootable device) in addition to registering as a BBS IPL device. The user cannot change the BIOS boot order in the setup menu.
                                                                101b - Reserved for future use. If specified, is treated as a value of 000b.
                                                                110b - Reserved for future use. If specified, is treated as a value of 000b.
                                                                111b - Reserved for future use. If specified, is treated as a value of 000b.
  7       RFU    0b               0b                          Reserved. Must be 0.
  6       RFU    0b               0b                          Reserved. Must be 0.
  5       DFU    0b               0b                          Disable flash update. If this bit is set to 1, the user is not allowed to update the flash image using PROSet. Default value is 0.
  4       DLWS   0b               0b                          Disable legacy wakeup support. If this bit is set to 1, the user is not allowed to change the legacy OS wakeup support menu option. Default value is 0.
  3       DBS    0b               0b                          Disable boot selection. If this bit is set to 1, the user is not allowed to change the boot order menu option. Default value is 0.
  2       DPS    0b               0b                          Disable protocol select. If set to 1, the user is not allowed to change the boot protocol. Default value is 0.
  1       DTM    0b               0b                          Disable title message. If this bit is set to 1, the title message displaying the version of the boot agent is suppressed; the Control-S message is also suppressed. This is for OEMs who do not wish the boot agent to display any messages at system boot. Default value is 0.
  0       DSM    0b               0b                          Disable setup menu. If this bit is set to 1, the user is not allowed to invoke the setup menu by pressing Control-S. In this case, the EEPROM may only be changed via an external program. Default value is 0.

6.10.6.3 PXE Version (Word 0x32)

Word 32h of the EEPROM is used to store the version of the boot agent that is stored in the flash image. When the boot agent loads, it can check this value to determine if any first-time configuration needs to be performed. The agent then updates this word with its version. Some diagnostic tools that report the version of the boot agent in the flash also read this word. The format of this word is:

  Bit(s)  Name  Hardware Default  Loaded from EEPROM: 0x1314  Function
  15:12   MAJ   0x0               0x1                         PXE boot agent major version. Default value is 0.
  11:8    MIN   0x0               0x3                         PXE boot agent minor version. Default value is 0.
  7:0     BLD   0x0               0x14                        PXE boot agent build number. Default value is 0.

6.10.6.4 IBA Capabilities (Word 0x33)

Word 33h of the EEPROM is used to enumerate the boot technologies that have been programmed into the flash. This is updated by flash configuration tools and is not updated or read by IBA.

  Bit(s)  Name      Hardware Default  Loaded from EEPROM: 0x4003  Function
  15:14   SIG       0x1               0x1                         Signature. Must be set to 01 to indicate that this word has been programmed by the agent or other configuration software.
  13:5    RFU       0b                0b                          Reserved. Must be 0.
  4       ISCSI     0b                0b                          iSCSI boot is present in flash if set to 1.
  3       EFI       0b                0b                          EFI UNDI driver is present in flash if set to 1.
  2       Reserved  0b                0b                          Set to 0.
  1       UNDI      0b                1b                          PXE UNDI driver is present in flash if set to 1.
  0       BC        0b                1b                          PXE base code is present in flash if set to 1.
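The PXE configuration words are plain bit-fields, so the word 30h, 32h, and 33h layouts above can be unpacked with shifts and masks. The following is a minimal sketch. The function names and dictionary keys are hypothetical. FSD value 110b is not listed in the table above, so it is reported as "not listed" rather than guessed.

```python
FSD = {0b000: "auto-negotiate", 0b001: "10 Mb/s half duplex",
       0b010: "100 Mb/s half duplex", 0b011: "auto-negotiate",  # 011b is treated as 000b
       0b100: "10 Mb/s full duplex", 0b101: "100 Mb/s full duplex",
       0b111: "1000 Mb/s full duplex"}
PROMPT_SECONDS = {0b00: 2, 0b01: 3, 0b10: 5, 0b11: 0}
DEFAULT_BOOT = {0b00: "network boot, then local boot",
                0b01: "local boot, then network boot",
                0b10: "network boot only", 0b11: "local boot only"}
PORT_STATUS = {0b000: "PXE", 0b001: "boot disabled", 0b010: "iSCSI primary",
               0b011: "iSCSI secondary", 0b100: "FCoE"}


def decode_word30(word):
    """Main setup options (word 30h)."""
    return {
        "fsd": FSD.get((word >> 10) & 0x7, "not listed"),
        "dsm": bool((word >> 8) & 0x1),
        "prompt_seconds": PROMPT_SECONDS[(word >> 6) & 0x3],
        "default_boot": DEFAULT_BOOT[(word >> 3) & 0x3],
        # 101b, 110b and 111b are reserved and behave like "boot disabled".
        "port_status": PORT_STATUS.get(word & 0x7, "boot disabled"),
    }


def decode_word32(word):
    """PXE version (word 32h) as a (major, minor, build) tuple."""
    return (word >> 12) & 0xF, (word >> 8) & 0xF, word & 0xFF


def decode_word33(word):
    """IBA capabilities (word 33h)."""
    return {
        "signature_ok": (word >> 14) & 0x3 == 0b01,
        "iscsi": bool(word & 0x10),
        "efi": bool(word & 0x08),
        "undi": bool(word & 0x02),
        "base_code": bool(word & 0x01),
    }


setup = decode_word30(0x0100)      # EEPROM default from the word 30h table
version = decode_word32(0x1314)    # EEPROM default from the word 32h table
caps = decode_word33(0x4003)       # EEPROM default from the word 33h table
```

For the EEPROM defaults listed in the tables, `version` decodes to major 1, minor 3, build 0x14, and `caps` reports a valid signature with the PXE UNDI driver and PXE base code present.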
6.10.6.5 Setup Options PCI Function 1 (Word 0x34)

This word is the same as word 30h, but for function 1 of the device.

6.10.6.6 Configuration Customization Options PCI Function 1 (Word 0x35)

This word is the same as word 31h, but for function 1 of the device.

6.10.6.7 iSCSI Option ROM Version (Word 0x36)

Word 0x36 of the NVM stores the version of the iSCSI option ROM, in the same format as the PXE version at word 0x32. The value must be above 0x2000; values below that (word 0x1FFF = 16 KB NVM size) are reserved. iSCSIUtl, FLAUtil, and DMiX update the iSCSI option ROM version if the value is above 0x2000, 0x0000, or 0xFFFF. A value in the range 0x0040 - 0x1FFF should be kept and not be overwritten.

6.10.6.8 Setup Options PCI Function 2 (Word 0x38)

This word is the same as word 30h, but for function 2 of the device.

6.10.6.9 Configuration Customization Options PCI Function 2 (Word 0x39)

This word is the same as word 31h, but for function 2 of the device.

6.10.6.10 Setup Options PCI Function 3 (Word 0x3A)

This word is the same as word 30h, but for function 3 of the device.

6.10.6.11 Configuration Customization Options PCI Function 3 (Word 0x3B)

This word is the same as word 31h, but for function 3 of the device.

6.10.7 iSCSI Boot Configuration Offset (Word 0x3D)

  Bit   Name    Description
  15:0  Offset  Defines the offset in EEPROM where the iSCSI boot configuration structure starts.

6.10.7.1 iSCSI Module Structure

  Configuration Item    Size in Bytes  Comments
  iSCSI boot signature  2              'i', 's'
  iSCSI block size      2              Total byte size of the iSCSI configuration block.
  Structure version     1              Version of this structure. Should be set to 1.
  Reserved              1              Reserved for future use.
  Initiator name        255 + 1        iSCSI initiator name. This field is optional and built by manual input, DHCP host name, or with MAC address as defined in section 4.4.
  Reserved              34             Reserved for future use.

  The fields below are per port.

  Flags                 2              Bit 00h: Enable DHCP.
                                         0 = Use static configurations from this structure.
                                         1 = Overrides configurations retrieved from DHCP.
                                       Bit 01h: Enable DHCP for getting iSCSI target information.
                                         0 = Use static target configuration.
                                         1 = Use DHCP to get target information by the option 17 root path.
                                       Bits 02h - 03h: Authentication type.
                                         00 = None
                                         01 = One-way CHAP
                                         02 = Mutual CHAP
                                       Bits 04h - 05h: Ctrl-D setup menu.
                                         00 = Enabled
                                         03 = Disabled, skip Ctrl-D entry
                                       Bits 06h - 07h: Reserved.
                                       Bits 08h - 09h: ARP retries. Retry value.
                                       Bits 0Ah - 0Fh: ARP timeout. Timeout value for each try.
  Initiator IP          4              Initiator DHCP flag not set: this field should contain the initiator IP address. Set: this field is ignored.
  Subnet mask           4              Initiator DHCP flag not set: this field should contain the subnet mask. Set: this field is ignored.
  Gateway IP            4              Initiator DHCP flag not set: this field should contain the gateway IP address. Set: this field is ignored.
  Boot LUN              2              Target DHCP flag not set: the iSCSI target LUN number should be specified. Set: this field is ignored.
  Target IP             4              Target DHCP flag not set: IP address of the iSCSI target. Set: this field is ignored.
  Target port           2              Target DHCP flag not set: TCP port used by the iSCSI target. Default is 3260. Set: this field is ignored.
  Target name           255 + 1        Target DHCP flag not set: the iSCSI target name should be specified. Set: this field is ignored.
  CHAP password         16 + 2         The minimum CHAP secret must be 12 octets and the maximum CHAP secret size is 16. The last 2 bytes are null alignment padding.
  CHAP user name        127 + 1        The user name must be a non-null value and the maximum size of the user name allowed is 127 characters.
  Reserved              2              Reserved.
  Mutual CHAP password  16 + 2         The minimum mutual CHAP secret must be 12 octets and the maximum mutual CHAP secret size is 16. The last 2 bytes are null alignment padding.
  Reserved              160            Reserved for future use.

The maximum amount of boot configuration information that is stored is 834 bytes (417 words); however, the iSCSI boot implementation can limit this value in order to work with a smaller EEPROM. Variable length fields are used to limit the total amount of EEPROM that is used for iSCSI boot information. Each field is preceded by a single byte that indicates how much space is available for that field. For example, if the initiator name field is being limited to 128 bytes, then it is preceded with a single byte with the value of 128. The following field begins at 128 bytes after the beginning of the initiator name field regardless of the actual size of the field. The variable length fields must be null terminated unless they reach the maximum size specified in the length byte.

6.10.8 Alternate MAC Address Pointer (Word 0x37)

This word may point to a location in the EEPROM containing additional MAC addresses used by system management functions. If the additional MAC addresses are not supported, the word shall be set to 0xFFFF.

6.10.9 Checksum Word (Word 0x3F)

The checksum word (0x3F) is used to ensure that the base EEPROM image is a valid image. The value of this word should be calculated such that after adding all the words (0x00:0x3F), including the checksum word itself, the sum is 0xBABA. The initial value in the 16-bit summing register should be 0x0000 and the carry bit should be ignored after each addition.

Note: Hardware does not calculate the word 0x3F checksum during EEPROM write; it must be calculated by software independently and included in the EEPROM write data. Hardware does not compute a checksum over words 0x00:0x3F during EEPROM reads in order to determine validity of the EEPROM image; this field is provided strictly for software verification of EEPROM validity. All hardware configuration based on the content of words 0x00:0x3F depends on the validity of the Signature field of EEPROM Initialization Control Word 1 (Signature must be 01b).
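The checksum rule for word 0x3F (a 16-bit sum over words 0x00:0x3F equal to 0xBABA, with carries discarded) can be sketched as follows. This is an illustrative software-side calculation only. The function names and the sample image contents are hypothetical.

```python
CHECKSUM_TARGET = 0xBABA
CHECKSUM_WORD = 0x3F


def eeprom_checksum(words):
    """Value to store in word 0x3F, given words 0x00..0x3E.

    Masking the total to 16 bits once gives the same result as dropping
    the carry after each individual addition.
    """
    partial = sum(words[:CHECKSUM_WORD]) & 0xFFFF
    return (CHECKSUM_TARGET - partial) & 0xFFFF


def eeprom_image_is_valid(words):
    """True if words 0x00..0x3F (checksum word included) sum to 0xBABA."""
    return sum(words[:CHECKSUM_WORD + 1]) & 0xFFFF == CHECKSUM_TARGET


# Made-up base image: 64 words, a few of them non-zero.
image = [0x0000] * 0x40
image[0x00] = 0x1234
image[0x30] = 0x0100
image[0x31] = 0x4000
image[CHECKSUM_WORD] = eeprom_checksum(image)
```

After the last line, `eeprom_image_is_valid(image)` holds. Changing any word in 0x00:0x3E without recomputing word 0x3F makes it fail.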
6.10.10 Image Unique ID (Word 0x42, 0x43)

These words contain a unique 32-bit ID for each image generated by Intel to enable tracking of images and comparison to the original image if testing a customer EEPROM image.
7.0 Inline Functions

7.1 Receive Functionality

7.1.1 Rx Queues Assignment

A received packet goes through three stages of filtering as shown in Figure 7-1. Figure 7-1 describes a switch-like structure that is used in virtualization mode to route packets between the network port (top of drawing) and one or more virtual ports (bottom of figure), where each virtual port can be associated with a virtual machine, an IOVM, a VMM, or the like.

The first step in queue assignment is to make sure that the packet is received by the port. This is done by a set of L2 filters as described in Section 7.1.2.

The second stage is specific to virtualization environments and defines the virtual ports (called pools) that are the targets for the Rx packet. A packet can be associated with any number of ports/pools; the selection process is described in Section 7.1.1.2.

Figure 7-1. Stages in Packet Filtering
In the third stage, a receive packet that successfully passed the Rx filters is associated with one or more receive descriptor queues as described in this section.

The following filter mechanisms determine the destination of a receive packet. These are described briefly in this section and in full detail in separate sections:

• Virtualization: In a virtualized environment, DMA resources are shared between more than one software entity (operating system and/or software device driver). This is done by allocating receive descriptor queues to virtual partitions (VMM, IOVM, VMs, or VFs). Allocating queues to virtual partitions is done in sets, each with the same number of queues, called queue pools or pools. Virtualization assigns to each received packet one or more pool indices. Packets are routed to a pool based on their pool index and other considerations, such as Receive Side Scaling (RSS). See Section 7.1.1.2 for details on routing for virtualization.
• RSS: RSS distributes packet processing between several processor cores by assigning packets into different descriptor queues. RSS assigns to each received packet an RSS index. Packets are routed to one of a set of Rx queues based on their RSS index and other considerations such as virtualization. See Section 7.1.1.7 for details on RSS.
• L2 Ethertype filters: These filters identify packets by their L2 Ether-type and assign them to receive queues. Examples of possible uses are LLDP packets and 802.1X packets. See Section 7.1.1.4 for more details. The 82576 incorporates four Ether-type filters.
• L3/L4 5-tuple filters: These filters identify specific L3/L4 flows or sets of L3/L4 flows. Each filter consists of a 5-tuple (protocol, source and destination IP addresses, source and destination TCP/UDP port) and routes packets into one of the Rx queues. The 82576 has eight such filters. See Section 7.1.1.5 for details.
• TCP SYN filters: The 82576 might route TCP packets with their SYN flag set into a separate queue. SYN packets are often used in SYN attacks to load the system with numerous requests for new connections. By filtering such packets to a separate queue, security software can monitor and act on SYN attacks. See Section 7.1.1.6 for more details.

Typically, packet reception consists of recognizing the presence of a packet on the wire, performing address filtering, storing the packet in the receive data FIFO, transferring the data to one of the 16 receive queues in host memory, and updating the state of a receive descriptor.

Note: Maximum supported received-packet size is 9.5 KB (9728 bytes).

A received packet is allocated to a queue based on the previous criteria and the following order:

• Queue by L2 Ether-type filters (if a match).
• If RFCTL.SYNQFP is 0b, then:
  • Queue by L3/L4 5-tuple filters (if a match).
  • Queue by SYN filter (if a match).
• If RFCTL.SYNQFP is 1b, then:
  • Queue by SYN filter (if a match).
  • Queue by L3/L4 5-tuple filters (if a match).
• Define a pool (in case of virtualization).
• Queue by RSS.

Table 7-1 lists the allocation of the queues in each of the modes.
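The allocation order above amounts to a first-match-wins chain in which RFCTL.SYNQFP swaps the priority of the SYN and 5-tuple filters. The following is a simplified software model of that order for the non-virtualized case, assuming each filter stage has already produced either a queue number or no match. The function and parameter names are hypothetical, and the pool-definition step used in virtualization is left out.

```python
def select_rx_queue(ethertype_queue, five_tuple_queue, syn_queue,
                    rss_queue, synqfp):
    """Return the Rx queue for a packet, following the documented order.

    Each *_queue argument is the queue chosen by a matching filter of that
    kind, or None when no such filter matched. rss_queue is the queue that
    RSS (or the default queue, when RSS is off) would select.
    synqfp models the RFCTL.SYNQFP bit.
    """
    # L2 Ether-type filters always come first.
    if ethertype_queue is not None:
        return ethertype_queue
    # SYNQFP = 0b: 5-tuple before SYN. SYNQFP = 1b: SYN before 5-tuple.
    if synqfp:
        ordered = (syn_queue, five_tuple_queue)
    else:
        ordered = (five_tuple_queue, syn_queue)
    for queue in ordered:
        if queue is not None:
            return queue
    # No special filter matched: fall back to RSS.
    return rss_queue


# A SYN packet that also matches a 5-tuple filter:
q_tuple_first = select_rx_queue(None, 5, 9, 2, synqfp=False)
q_syn_first = select_rx_queue(None, 5, 9, 2, synqfp=True)
```

In this example the same packet is steered to queue 5 when SYNQFP is 0b and to queue 9 when SYNQFP is 1b.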
Table 7-1. Queue Allocation (1)

  Virtualization  RSS       Queue Allocation
  Disabled        Disabled  One default queue (MRQC.DEF_Q).
  Disabled        Enabled   Up to 16 queues by RSS.
  Enabled         Disabled  One queue per VM (queues 0-7 for VM 0-7).
  Enabled         Enabled   Two queues per VM (queues 0, 8; 1, 9; 2, 10; 3, 11; 4, 12; 5, 13; 6, 14; 7, 15 for VM 0-7, respectively). Spread between the queues by RSS.

  (1) On top of this allocation, the special filters can override the queueing decision.

7.1.1.1 Queuing in a Non-Virtualized Environment

A received packet is assigned to a queue in the following manner:

• L2 Ether-type filters: Each filter identifies one of 16 Rx queues.
• SYN filter: Identifies one of 16 Rx queues.
• L3/L4 5-tuple filters: Each filter identifies one of 16 Rx queues.
• RSS filters: Identifies one of 2 x 8 queues through the RSS index. The following modes are supported:
  • No RSS: The default queue as defined in MRQC.DEF_Q is used for packets that do not meet any of the previous conditions.
  • RSS: A set of 16 queues is allocated for RSS. The queue is identified through the RSS index. Note that it is possible to use a subset of the 16 queues.
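The queue allocation in Table 7-1 can be written as a small lookup. The following sketch returns the candidate queues for a packet before any special filter overrides the decision. The function name and arguments are hypothetical. The VM-to-queue pairing (VM n uses queues n and n + 8) is read directly from the table.

```python
def candidate_queues(virtualization, rss, vm=0, default_queue=0):
    """Queues a packet may land in, per Table 7-1.

    virtualization, rss: whether each feature is enabled.
    vm: VM index 0-7 (only used when virtualization is enabled).
    default_queue: stands in for the MRQC.DEF_Q setting.
    """
    if not virtualization:
        # RSS may spread over up to 16 queues; otherwise one default queue.
        return list(range(16)) if rss else [default_queue]
    if not 0 <= vm <= 7:
        raise ValueError("VM index must be in the range 0-7")
    # One queue per VM, or two queues (n and n + 8) when RSS is enabled.
    return [vm, vm + 8] if rss else [vm]


no_virt_no_rss = candidate_queues(False, False, default_queue=4)
vm3_with_rss = candidate_queues(True, True, vm=3)
```

Here `no_virt_no_rss` holds only the default queue, and `vm3_with_rss` holds queues 3 and 11, matching the "3, 11" pair in the table.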
Figure 7-2. Rx Queuing Flow (Non-Virtualized)

7.1.1.2 Rx Queuing in a Virtualized Environment

The 16 Rx queues are allocated to a pre-configured number of queue sets called pools. In next generation VMDq mode, system software allocates the pools to the VMM, an IOVM, or to VMs. In IOV mode, each pool is associated with a VF.

Incoming packets are associated with pools based on their L2 characteristics as described in Section 7.10.3. This section describes the following stage, where an Rx queue is assigned to each replication of the Rx packet as determined by its pool's association.
A received packet is assigned to a queue within a pool in the following manner:

• L2 Ether-type filters: Each filter identifies a specific queue, belonging to some pool (the queue designation determines the pool and is usually allocated to the VMM or a service operating system).
• SYN filter: Not supported in VT modes.
• L3/L4 5-tuple filters: Each filter is associated with a single Rx queue, belonging to a specific pool.
• RSS filters: The following modes are supported:
  • No RSS: A single queue is allocated per pool (queue 0 of each pool).
  • RSS: All 16 queues are allocated to pools. Note that it is possible to enable RSS usage per pool using the VMOLR.RSSE bit. If the packet is not suitable for RSS, then queue 0 of each pool is used.
Figure 7-3. Rx Queuing Flow (Virtualization)
7.1.1.3 Queue Configuration Registers

Configuration registers (CSRs) that control queue operation are replicated per queue (total of 16 copies of each register). Each of the replicated registers corresponds to a queue such that the queue index equals the serial number of the register (for example, register 0 corresponds to queue 0). Registers included in this category are:

• RDBAL and RDBAH: Rx Descriptor Base
• RDLEN: Rx Descriptor Length
• RDH: Rx Descriptor Head
• RDT: Rx Descriptor Tail
• RXDCTL: Receive Descriptor Control
• RXCTL: Rx DCA Control

CSRs that define the functionality of descriptor queues are replicated per VF index to allow for a separate configuration in a virtualization environment (total of eight copies of each register). Each of the replicated registers corresponds to a set of queues with the same VF index, such that the VF index of the queue identifies the serial number of the register. Registers included in this category are:

• SRRCTL: Split and Replication Receive Control
• PSRTYPE: Packet Split Receive Type

7.1.1.4 L2 Ether-type Filters

These filters identify packets by L2 Ether-type and assign them to a receive queue. The following usages have been identified:

• IEEE 802.1X packets: Extensible Authentication Protocol over LAN (EAPOL).
• Time sync packets (such as IEEE 1588): Identifies Sync or Delay_Req packets.

The 82576 incorporates eight Ether-type filters.

The Packet Type field in the Rx descriptor captures the filter number that matched the L2 Ether-type. See Section 7.1.5 for decoding of the Packet Type field.

The Ether-type filters are configured via the ETQF register as follows:

• The EType field contains the 16-bit Ether-type compared against all L2 type fields in the Rx packet.
• The Filter Enable bit enables identification of Rx packets by Ether-type according to this filter. If this bit is cleared, the filter is ignored for all purposes.
• The Rx Queue field contains the absolute destination queue for the packet.
• The 1588 Time Stamp field indicates that the packet should be time stamped according to the IEEE 1588 specification.
• The Queue Enable field enables forwarding Rx packets based on the Ether-type defined in this register.

Special considerations for virtualization modes:

• Packets that match an Ether-type filter are diverted from their original pool (the pool identified by the L2 filters) to the pool to which the queue in the Queue field belongs. In other words, the L2 filters are ignored in determining the pool for such packets.
• The same applies for multicast packets. A single copy is posted to the pool defined by the filter.
• Mirroring rules:
  • If a pool is being mirrored, the pool to which the queue in the Queue field belongs is used to determine if a packet that matches the filter should be mirrored.
  • The queue inside the pool (indicated by the Queue field) is used for both the original pool and the mirroring pool.

7.1.1.5 L3/L4 5-tuple Filters

These filters identify specific L3/L4 flows or sets of L3/L4 flows. Each filter consists of a 5-tuple (protocol, source and destination IP addresses, source and destination TCP/UDP port) and forwards packets into one of the Rx queues. In a virtualized environment, each filter can be associated with one specific VF and a packet must match the L2 conditions for that VF to match the 5-tuple filter.

Note: On fragmented packets, TCP/UDP headers are not parsed, so the source port and destination port fields will not match. If a filter requires matches for source/destination ports, then fragmented packets will not match using that filter. If a filter bypasses (ignores) the source port, destination port, and control bits, it can still be used to filter the protocol, source address, and destination address.

The 82576 incorporates eight such filters.

The 5-tuple filters are configured via the FTQF, SPQF, IMIR, IMIR_EXT, DAQF, and SAQF registers as follows (per filter):

• Protocol: Identifies the IP protocol, part of the 5-tuple queue filters. Enabled by a bit in the Mask field.
• Source address: Identifies the IP source address, part of the 5-tuple queue filters. Enabled by a bit in the Mask field. Only IPv4 addresses are supported.
• Destination address: Identifies the IP destination address, part of the 5-tuple queue filters. Enabled by a bit in the Mask field. Only IPv4 addresses are supported.
• Source port: Identifies the TCP/UDP source port, part of the 5-tuple queue filters. Enabled by a bit in the Mask field.
• Destination port: Identifies the TCP/UDP destination port, part of the 5-tuple queue filters. Enabled if the IMIR.PORT_BP field is cleared.
• Size threshold: Identifies the length of the packet that should trigger the filter. This is the length as received by the host, not including any part of the packet removed by hardware. Enabled by the Size_BP field.
• Control bits: Identify TCP flags that might be part of the filtering process. Enabled by the CtrlBit_BP field.
• Rx queue: Determines the Rx queue for packets that match this filter. Only the LSB bits are used:
  • In a non-virtualized configuration, the Rx Queue field contains the queue serial number.
  • In the virtualized configuration, the Rx Queue field contains the queue serial number within the set of queues of the VF associated (via the VF field) with this filter. In this case, the packet is sent to all VFs in the VF index list (see Section 7.1.1.2 for details) in the queue defined in the filter.
• Queue enable: Enables forwarding a packet that uses this filter.
• VF: Identifies the VF associated with this filter by its VF index (virtualization modes only). A packet must match the VF filters (such as MAC address) and the 5-tuple filter for this filter to apply.
  Note: The above field should not be set to match a mirror port (such as a port that receives promiscuous traffic), as it influences the queuing of packets sent to the mirrored port.
• VF mask: Determines if the VF field participates in the 5-tuple match or is ignored:
  • Must be set to 1b in the non-virtualized case.
  • In a virtualized configuration:
    • When set to 0b, only unicast packets that match the VF field are candidates for this filter.
    • When set to 1b, unicast, multicast, and broadcast packets might all match the 5-tuple filter. VF association is not checked. The Rx Queue field defines a queue for each VF.
• Mask: A 5-bit field that masks each of the fields in the 5-tuple (L4 protocol, IP addresses, TCP/UDP ports). The filter is a logical AND of the non-masked 5-tuple fields. If all 5-tuple fields are masked, the filter is not used for queue forwarding.

Note: If more than one 5-tuple filter with the same priority is matched by the packet, the first filter (lowest ordinal number) is used to define the queue destination of this packet. The immediate interrupt and 1588 actions are defined by the OR of all the matching filters.

Filtering rules for IPv6 packets are:

• If a filter defines at least one of the IP source and destination addresses, then an IPv6 packet always misses such a filter.
• If a filter masks both the IP source and destination addresses, then an IPv6 packet is compared against the remaining fields of the filter.
• Tunneled packets are not matched by the 5-tuple filters.

Note: These filters are not available for VM to VM traffic forwarding.

7.1.1.6 SYN Packet Filters

The 82576 might forward TCP packets whose SYN flag is set into a separate queue. SYN packets are often used in SYN attacks to load the system with numerous requests for new connections. By filtering such packets to a separate queue, security software can monitor and act on SYN attacks.

SYN filters are configured via the SYNQF registers as follows:

• Queue En: Enables forwarding of SYN packets to a specific queue.
• Rx Queue field: Contains the destination queue for the packet.

This filter is not to be used in a virtualized environment.

7.1.1.7 Receive-Side Scaling (RSS)

RSS is a mechanism to distribute received packets into several descriptor queues. Software then assigns each queue to a different processor, sharing the load of packet processing among several processors.

As described in Section 7.1.1.7, the 82576 uses RSS as one ingredient in its packet assignment policy (the others are the various filters and virtualization). The RSS output is an RSS index. The 82576's global assignment uses these bits (or only some of the LSBs) as part of the queue number.

RSS is enabled in the MRQC register. The RSS status field in the descriptor write-back is enabled when the RXCSUM.PCSD bit is set (fragment checksum is disabled). RSS is therefore mutually exclusive with UDP fragmentation. Also, support for RSS is not provided when the legacy receive descriptor format is used.
When RSS is enabled, the 82576 provides software with the following information as required by Microsoft* RSS or for device driver assistance:

• A Dword result of the Microsoft* RSS hash function, to be used by the stack for flow classification, is written into the receive packet descriptor (required by Microsoft* RSS).
• A 4-bit RSS Type field conveys the hash function used for the specific packet (required by Microsoft* RSS).

Figure 7-4 shows the process of computing an RSS output:

1. The receive packet is parsed into the header fields used by the hash operation (such as IP addresses, TCP port, etc.).
2. A hash calculation is performed. The 82576 supports a single hash function, as defined by Microsoft* RSS. The 82576 does not indicate to the software device driver which hash function is used. The 32-bit result is fed into the packet receive descriptor.
3. The seven LSBs of the hash result are used as an index into a 128-entry indirection table. Each entry provides a 3-bit RSS output index.

When RSS is disabled, packets are assigned an RSS output index = zero. System software might enable or disable RSS at any time. While disabled, system software might update the contents of any of the RSS-related registers.

When multiple receive queues are enabled in RSS mode, un-decodable packets are assigned an RSS output index = zero. The 32-bit tag (normally a result of the hash function) equals zero.
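Steps 2 and 3 above can be modeled in software as a hash followed by a table lookup. The text here only says the hash is "as defined by Microsoft* RSS"; the sketch below assumes the Toeplitz construction that Microsoft's RSS documentation describes. The key, the indirection table contents, and all function names are placeholders for illustration, not values taken from this datasheet.

```python
def toeplitz_hash(key, data):
    """32-bit Toeplitz hash of data (bytes) under key (bytes).

    For every set bit of the input, taken MSB first, the 32-bit window of
    the key starting at that bit position is XORed into the result.
    """
    key_bits = len(key) * 8
    if key_bits < len(data) * 8 + 32:
        raise ValueError("key must be at least 4 bytes longer than the input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for i in range(len(data) * 8):
        if (data[i // 8] >> (7 - i % 8)) & 1:
            result ^= (key_int >> (key_bits - 32 - i)) & 0xFFFFFFFF
    return result


def rss_output_index(hash32, indirection_table):
    """Step 3: seven LSBs of the hash index a 128-entry table of 3-bit values."""
    if len(indirection_table) != 128:
        raise ValueError("indirection table must have 128 entries")
    return indirection_table[hash32 & 0x7F] & 0x7


# Placeholder 40-byte key and a round-robin table over eight queue indices.
demo_key = bytes(range(1, 41))
demo_table = [entry % 8 for entry in range(128)]

# Hash input for an IPv4/TCP packet: source IP, destination IP,
# source port, destination port (12 bytes), as in Microsoft's definition.
demo_input = bytes([192, 0, 2, 1, 198, 51, 100, 7]) + (1024).to_bytes(2, "big") + (80).to_bytes(2, "big")
demo_hash = toeplitz_hash(demo_key, demo_input)
demo_index = rss_output_index(demo_hash, demo_table)
```

Because the hash is a XOR of key windows, an all-zero input hashes to zero and an input with only its first bit set hashes to the first 32 bits of the key, which gives two easy sanity checks for any implementation.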
Figure 7-4. RSS Block Diagram

7.1.1.7.1 RSS Hash Function

Section 7.1.1.7.1 provides a verification suite used to validate that the hash function is computed according to Microsoft* nomenclature.

The 82576 hash function follows the Microsoft* definition. A single hash function is defined with several variations for the following cases:

• TcpIPv4: The 82576 parses the packet to identify an IPv4 packet containing a TCP segment per the criteria described later in this section. If the packet is not an IPv4 packet containing a TCP segment, RSS is not done for the packet.
• IPv4: The 82576 parses the packet to identify an IPv4 packet. If the packet is not an IPv4 packet, RSS is not done for the packet.
• TcpIPv6: The 82576 parses the packet to identify an IPv6 packet containing a TCP segment per the criteria described later in this section. If the packet is not an IPv6 packet containing a TCP segment, RSS is not done for the packet.
intel ? 82576eb gbe controller ? inline functions intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 284 ? tcpipv6ex ? the 82576 parses the packet to identify an ipv6 packet containing a tcp segment with extensions per the criteria described later in this section. if the packet is not an ipv6 packet containing a tcp segment, rss is not done for the packet. extension headers should be parsed for a home-address-option field (for source address) or the routing-header-type-2 field (for destination address). ? ipv6ex ? the 82576 parses the packet to identify an ipv6 packet. extension headers should be parsed for a home-address-option field (for source address) or the routing-header-type-2 field (for destination address). note that the packet is not required to contain any of these extension headers to be hashed by this function. in this case, the ipv6 hash is used. if the packet is not an ipv6 packet, rss is not done for the packet. ? ipv6 ? the 82576 parses the packet to identify an ipv6 packet. if the packet is not an ipv6 packet, receive-side-scaling is not done for the packet. the following additional cases are not part of the microsoft* rss specification: ? udpipv4 ? the 82576 parses the packet to identify a packet with udp over ipv4. ? udpipv6 ? the 82576 parses the packet to identify a packet with udp over ipv6. ? udpipv6ex ? the 82576 parses the packet to identify a packet with udp over ipv6 with extensions. a packet is identified as containing a tcp segment if all of the following conditions are met: ? the transport layer protocol is tcp (not udp, icmp, igmp, etc.). ? the tcp segment can be parsed (such as ip options can be parsed, packet not encrypted). ? the packet is not fragmented (even if the fragment contains a complete tcp header). bits[31:16] of the multiple receive queues command (mrqc) register enable each of the above hash function variations (several can be set at a given time). 
if several functions are enabled at the same time, priority is defined as follows (skip functions that are not enabled):

ipv4 packet:
1. try using the tcpipv4 function.
2. try using the udpipv4 function.
3. try using the ipv4 function.

ipv6 packet:
1. if tcpipv6ex is enabled, try using the tcpipv6ex function; else if tcpipv6 is enabled, try using the tcpipv6 function.
2. if udpipv6ex is enabled, try using the udpipv6ex function; else if udpipv6 is enabled, try using the udpipv6 function.
3. if ipv6ex is enabled, try using the ipv6ex function; else if ipv6 is enabled, try using the ipv6 function.

the following combinations are currently supported:
- any combination of ipv4, tcpipv4, and udpipv4.
and/or
- any combination of either ipv6, tcpipv6, and udpipv6 or ipv6ex, tcpipv6ex, and udpipv6ex.

when a packet cannot be parsed by the previously mentioned rules, it is assigned an rss output index = zero. the 32-bit tag (normally a result of the hash function) equals zero.
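the priority rules above can be sketched as a small selection routine. this is an illustration in python, not the hardware implementation. the function name, the `l4` argument (which stands for the parser's verdict: "tcp", "udp", or `None` when no usable l4 header was found), and the use of a python set for the enabled variations are all assumptions made for the example.

```python
def select_rss_function(ip_version, l4, enabled):
    """Pick the hash function variation for a packet, or None (sketch).

    ip_version: 4 or 6.
    l4: "tcp", "udp", or None if no parsable, unfragmented L4 header.
    enabled: set of enabled variation names, e.g. {"tcpipv4", "ipv4"}.
    """
    if ip_version == 4:
        steps = [(("tcpipv4",), "tcp"), (("udpipv4",), "udp"), (("ipv4",), None)]
    elif ip_version == 6:
        # Within each step the "ex" variation is preferred when enabled.
        steps = [
            (("tcpipv6ex", "tcpipv6"), "tcp"),
            (("udpipv6ex", "udpipv6"), "udp"),
            (("ipv6ex", "ipv6"), None),
        ]
    else:
        return None
    for names, required_l4 in steps:
        chosen = next((n for n in names if n in enabled), None)
        if chosen is not None and required_l4 in (None, l4):
            return chosen
    return None  # caller assigns output index zero and a hash of zero


print(select_rss_function(4, "udp", {"tcpipv4", "ipv4"}))
```

a result of `None` corresponds to the final rule above: the packet gets an rss output index of zero and a 32-bit tag of zero.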
the 32-bit result of the hash computation is written into the packet descriptor and also provides an index into the indirection table.

the following notation is used to describe the hash functions:
- ordering is little endian in both bytes and bits. for example, the ip address 161.142.100.80 translates into 0xa18e6450 in the signature.
- a "^" denotes bit-wise xor operation of same-width vectors.
- @x-y denotes bytes x through y (including both of them) of the incoming packet, where byte 0 is the first byte of the ip header. in other words, all byte offsets are treated as offsets into a packet from which the framing layer header has been stripped. therefore, the source ipv4 address is referred to as @12-15, while the destination ipv4 address is referred to as @16-19.
- @x-y, @v-w denotes concatenation of bytes x-y, followed by bytes v-w, preserving the order in which they occurred in the packet.

all hash function variations (ipv4 and ipv6) follow the same general structure. specific details for each variation are described in the following section. the hash uses a random secret key of 320 bits (40 bytes); the key is typically supplied through the rss random key register (rssrk).

the algorithm works by examining each bit of the hash input from left to right. intel's nomenclature defines left and right for a byte-array as follows: given an array K with k bytes, intel's nomenclature assumes that the array is laid out as shown:

K[0] K[1] K[2] ... K[k-1]

K[0] is the left-most byte, and the msb of K[0] is the left-most bit. K[k-1] is the right-most byte, and the lsb of K[k-1] is the right-most bit.
computehash(input[], n)
for hash-input input[] of length n bytes (8n bits) and a random secret key K of 320 bits:
    result = 0;
    for each bit b in input[] {
        if (b == 1) then result ^= (left-most 32 bits of K);
        shift K left 1 bit position;
    }
    return result;

the following pseudo-code examples are intended to help clarify exactly how the hash is to be performed for ipv4 and ipv6, with and without the ability to parse the tcp or udp header.

7.1.1.7.1.1 hash for ipv4 with tcp
concatenate sourceaddress, destinationaddress, sourceport, destinationport into one single byte-array, preserving the order in which they occurred in the packet:
input[12] = @12-15, @16-19, @20-21, @22-23.
result = computehash(input, 12);

7.1.1.7.1.2 hash for ipv4 with udp
concatenate sourceaddress, destinationaddress, sourceport, destinationport into one single byte-array, preserving the order in which they occurred in the packet:
input[12] = @12-15, @16-19, @20-21, @22-23.
result = computehash(input, 12);
7.1.1.7.1.3 hash for ipv4 without tcp
concatenate sourceaddress and destinationaddress into one single byte-array:
input[8] = @12-15, @16-19
result = computehash(input, 8)

7.1.1.7.1.4 hash for ipv6 with tcp
similar to above:
input[36] = @8-23, @24-39, @40-41, @42-43
result = computehash(input, 36)

7.1.1.7.1.5 hash for ipv6 with udp
similar to above:
input[36] = @8-23, @24-39, @40-41, @42-43
result = computehash(input, 36)

7.1.1.7.1.6 hash for ipv6 without tcp
input[32] = @8-23, @24-39
result = computehash(input, 32)

7.1.1.7.2 indirection table
the indirection table is a 128-entry structure, indexed by the seven lsbs of the hash function output. each entry of the table contains the following:
- bits [3:0]: rss index
note: in rss mode, all bits are used. in next generation vmdq + rss mode only bit 0 is used.

system software might update the indirection table during run time. such updates of the table are not synchronized with the arrival time of received packets. therefore, it is not guaranteed that a table update takes effect on a specific packet boundary.

7.1.1.7.3 rss verification suite
assume that the random key byte-stream is:
0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa
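the computehash pseudo-code and the key above can be exercised with a short python sketch. it treats the key as one large integer whose most significant bit is the left-most bit of K[0], so "the left-most 32 bits of K after shifting" becomes a right shift and mask. the names `compute_hash` and `ipv4_input` are illustrative. the expected values are those listed for 161.142.100.80:1766 (destination) and 66.9.149.187:2794 (source) in the ipv4 verification table of this section.

```python
import socket

# Key byte-stream from the verification suite above.
KEY = bytes([
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
])


def compute_hash(data, key=KEY):
    """Hash `data` (bytes) with `key`, examining input bits left to right."""
    if len(data) > len(key) - 4:
        raise ValueError("input too long for this key length")
    key_int = int.from_bytes(key, "big")  # K[0] is the left-most byte
    key_bits = len(key) * 8
    result = 0
    for pos in range(len(data) * 8):
        if data[pos // 8] & (0x80 >> (pos % 8)):
            # Left-most 32 bits of the key after `pos` one-bit left shifts.
            result ^= (key_int >> (key_bits - 32 - pos)) & 0xFFFFFFFF
    return result


def ipv4_input(src, dst, sport=None, dport=None):
    """Build the hash input: addresses, then ports if both are given."""
    data = socket.inet_aton(src) + socket.inet_aton(dst)
    if sport is not None and dport is not None:
        data += sport.to_bytes(2, "big") + dport.to_bytes(2, "big")
    return data


ipv4_only = compute_hash(ipv4_input("66.9.149.187", "161.142.100.80"))
ipv4_tcp = compute_hash(ipv4_input("66.9.149.187", "161.142.100.80", 2794, 1766))
print(hex(ipv4_only), hex(ipv4_tcp))  # 0x323e8fc2 0x51ccc178
```

note that the input is built source first, then destination, matching the @12-15, @16-19, @20-21, @22-23 ordering, even though the verification table lists the destination column first.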
7.1.1.7.3.1 ipv4

table 7-2. ipv4
destination address/port   source address/port    ipv4 only    ipv4 with tcp
161.142.100.80:1766        66.9.149.187:2794      0x323e8fc2   0x51ccc178
65.69.140.83:4739          199.92.111.2:14230     0xd718262a   0xc626b0ea
12.22.207.184:38024        24.19.198.95:12898     0xd2d0a5de   0x5c2b394a
209.142.163.6:2217         38.27.205.30:48228     0x82989176   0xafc7327f
202.188.127.2:1303         153.39.163.191:44251   0x5d1809c5   0x10e828a2

7.1.1.7.3.2 ipv6
the ipv6 address tuples are only for verification purposes and might not make sense as a tuple.

table 7-3. ipv6
destination address/port           source address/port                           ipv6 only    ipv6 with tcp
3ffe:2501:200:3::1 (1766)          3ffe:2501:200:1fff::7 (2794)                  0x2cc18cd5   0x40207d3d
ff02::1 (4739)                     3ffe:501:8::260:97ff:fe40:efab (14230)        0x0f0c461c   0xdde51bbf
fe80::200:f8ff:fe21:67cf (38024)   3ffe:1900:4545:3:200:f8ff:fe21:67cf (44251)   0x4b61e985   0x02d1feef

7.1.1.7.4 association through mac address
each of the 24 mac address filters can be associated with a vf/vm. the vind field in the receive address high (rah) register determines the target vm. packets that do not match any of the mac filters (such as promiscuous) are assigned the default vt.

software can program different values to the mac filters (any bits in rah or ral) at any time. the 82576 would respond to the change on a packet boundary but does not guarantee the change to take place at some precise time.

7.1.2 l2 packet filtering
the role of receive packet filtering is to determine which of the incoming packets are allowed to pass to the local system and which should be dropped because they are not targeted to the local system. received packets can be destined to the host, to a manageability controller (bmc), or to both. this section describes how host filtering is done, and the interaction with management filtering.

as shown in figure 7-5, host filtering has three stages:
1. packets are filtered by l2 filters (mac address, unicast/multicast/broadcast). see section 7.1.2.1 for details.
2. packets are then filtered by vlan if a vlan tag is present. see section 7.1.2.2 for details.
3. packets are filtered by the manageability filters (port, ip, flex, other). see section 10.4.1 for details.

a packet is not forwarded to the host if any of the following takes place:
1. the packet does not pass mac address filters as described later in this section.
2. the packet does not pass vlan filtering as described later in this section.
3. the packet passes manageability filtering and then the manageability filters determine that the packet should not pass to host as well (see section 10.4.1 and the manc2h register).

a packet that passes receive filtering as previously described might still be dropped due to other reasons. normally, only good packets are received. these are defined as packets in which no undersize error, oversize error, packet error, length error, or crc error is detected. however, if the store-bad-packet bit is set (rctl.sbp), then bad packets that pass the filter function are stored in host memory. packet errors are indicated by error bits in the receive descriptor (rdesc.errors). it is possible to receive all packets, regardless of whether they are bad, by setting the promiscuous enables and the store-bad-packet bit.

if there is insufficient space in the receive fifo, hardware drops the packet and indicates the missed packet in the appropriate statistics registers.

note: crc errors before the sfd are ignored. any packet must have a valid sfd in order to be recognized by the 82576 (even bad packets).
figure 7-5. rx filtering flow chart

7.1.2.1 mac address filtering
figure 7-6 shows the mac address filtering. a packet passes successfully through the mac address filtering if any of the following conditions are met:
1. it is a unicast packet and promiscuous unicast filtering is enabled.
2. it is a multicast packet and promiscuous multicast filtering is enabled.
3. it is a unicast packet and it matches one of the unicast mac filters (host or manageability).
4. it is a multicast packet and it matches one of the multicast filters.
5. it is a broadcast packet and broadcast accept mode (bam) is enabled. note that in this case, for manageability traffic, the packet does not go through vlan filtering (vlan filtering is assumed to match).

figure 7-6. mac address rx filtering flow chart

7.1.2.1.1 unicast filter
the entire mac address is checked against the 24 host unicast addresses and four management unicast addresses (if enabled). the 24 host unicast addresses are controlled by the host interface (the mc must not change them). the other four addresses are dedicated to management functions and are only accessed by the bmc. the destination address of an incoming packet must exactly match one of the pre-configured host address filters or the manageability address filters. these addresses can be unicast or multicast. those filters are configured through the ral, rah, mmal, and mmah registers.

promiscuous unicast: receive all unicasts. promiscuous unicast mode can be set/cleared only through the host interface (not by the bmc) and it is usually used when the 82576 is used as a sniffer.

unicast hash table: destination address matching the unicast hash table (uta).

7.1.2.1.2 multicast filter (partial)
the 12-bit portion of the incoming packet multicast address must exactly match the multicast filter address (mfa) in order to pass multicast filtering. those 12 bits out of the 48 bits of the destination address can be selected by the mo field of rctl (section 8.10.1). these entries can be configured only by the host interface and cannot be controlled by the bmc. packets received according to this mode have the pif bit in the descriptor set to indicate imperfect filtering that should be validated by the software device driver.

promiscuous multicast: receive all multicast. promiscuous multicast mode can be set/cleared only through the host interface (not by the bmc) and it is usually used when the 82576 is used as a sniffer.

note: when the promiscuous bit is set and a multicast packet is received, the pif bit of the packet status is not set.

7.1.2.2 vlan filtering
a receive packet that successfully passed mac address filtering is then subjected to vlan header filtering.
1. if the packet does not have a vlan header, it passes to the next filtering stage.
   note: if extended vlan is enabled (ctrl_ext.extended_vlan is set), it is assumed that the first vlan tag is an extended vlan and it is skipped. all next stages refer to the second vlan.
2. if the packet has a vlan header and it passes a valid manageability vlan filter, then it passes to the next filtering stage.
3. if vlan filtering is disabled (rctl.vfe bit is cleared), the packet is forwarded to the next filtering stage.
4. if the packet has a vlan header, and it matches an enabled host vlan filter, the packet is forwarded to the next filtering stage.
5. if the packet has a vlan header and manc.bypass vlan is set, the packet is forwarded to the next filtering stage, but is a candidate for manageability forwarding only.
6. otherwise, the packet is dropped.

figure 7-7 shows the vlan filtering flow.
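the six vlan filtering rules above are evaluated in order, which can be sketched as a chain of early returns. this python illustration assumes the caller has already evaluated each condition to a boolean; the function name, argument names, and the three result strings are made up for the example, and extended vlan handling is not modeled.

```python
def vlan_filter(has_vlan, passes_mng_vlan_filter, vfe,
                matches_host_vlan_filter, bypass_vlan):
    """Apply the VLAN filtering steps in order (sketch).

    Returns "pass", "manageability-only", or "drop".
    """
    if not has_vlan:
        return "pass"                # step 1: untagged packet
    if passes_mng_vlan_filter:
        return "pass"                # step 2: manageability VLAN filter hit
    if not vfe:
        return "pass"                # step 3: rctl.vfe cleared
    if matches_host_vlan_filter:
        return "pass"                # step 4: enabled host VLAN filter hit
    if bypass_vlan:
        return "manageability-only"  # step 5: manc.bypass vlan set
    return "drop"                    # step 6


print(vlan_filter(True, False, True, False, False))
```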
figure 7-7. vlan filtering

7.1.2.3 manageability filtering
manageability filtering is described in section 10.4.1. figure 7-8 shows the manageability portion of the packet filtering; it is included here to make the description of receive packet filtering complete.

figure 7-8. manageability filtering

note: the manageability engine might decide to snoop or redirect part of the received packets, according to the external mc instructions and the eeprom settings.
7.1.3 receive data storage

7.1.3.1 host buffers
each descriptor points to one or more memory buffers that are designated by the software device driver to store packet data.

the size of the buffer can be set using either the generic rctl.bsize field, or the per-queue srrctl[n].bsizepacket field.

the receive buffer size is selected by bit settings in the receive control register (rctl.bsize). the register supports buffer sizes of 256, 512, 1024, and 2048 bytes. see section 12.7.1 for details.

if srrctl[n].bsizepacket is set to zero for any queue, the buffer size defined by rctl.bsize is used. otherwise, the buffer size defined by srrctl[n].bsizepacket is used.

for advanced descriptor usage, the srrctl.bsizeheader field is used to define the size of the buffers allocated to headers. the maximum buffer size supported is 960 bytes.

the 82576 places no alignment restrictions on receive memory buffer addresses. this is desirable in situations where the receive buffer was allocated by higher layers in the networking software stack, as these higher layers might have no knowledge of a specific device's buffer alignment requirements.

note: when the no-snoop enable bit is used in advanced descriptors, the buffer address is 16-bit (2-byte) aligned.

7.1.3.2 on-chip rx buffers
the 82576 contains a 64 kbyte packet buffer that can be used to store packets until they are forwarded to the host.

in addition, to support the forwarding of local packets as described in section 7.10.3, a switch buffer of 20 kbytes is provided. this buffer serves as a receive buffer for all the local traffic.

7.1.3.3 on-chip descriptor buffers
the 82576 contains a 32-descriptor cache for each receive queue, used to reduce the latency of packet processing and to optimize the usage of pcie bandwidth by fetching and writing back descriptors in bursts.
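the rule for choosing between rctl.bsize and srrctl[n].bsizepacket can be sketched as follows. this python example assumes both values have already been decoded to a size in bytes; the register encodings themselves are not given in this section, and the function names are illustrative.

```python
RCTL_BSIZE_CHOICES = (256, 512, 1024, 2048)  # sizes listed for rctl.bsize


def effective_rx_buffer_size(rctl_bsize, srrctl_bsizepacket):
    """Return the packet buffer size in bytes used by a queue (sketch)."""
    if srrctl_bsizepacket != 0:
        return srrctl_bsizepacket  # per-queue setting takes precedence
    if rctl_bsize not in RCTL_BSIZE_CHOICES:
        raise ValueError("unsupported rctl.bsize value")
    return rctl_bsize              # fall back to the generic setting


def buffers_needed(packet_len, buffer_size):
    """Number of buffers a packet of packet_len bytes occupies."""
    return max(1, -(-packet_len // buffer_size))  # ceiling division


size = effective_rx_buffer_size(2048, 0)
print(size, buffers_needed(1514, size))
```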
the fetch and writeback algorithms are described in section 7.1.6 and section 7.1.7.

7.1.4 legacy receive descriptor format
a receive descriptor is a data structure that contains the receive data buffer address and fields for hardware to store packet information. if srrctl[n].desctype = 000b, the 82576 uses the legacy rx descriptor as shown in table 7-4. the shaded areas indicate fields that are modified by hardware upon packet reception (so-called descriptor write-back).

note: legacy descriptors must not be used when advanced features such as virtualization or security features are activated.
table 7-4. legacy receive descriptor (rdesc) layout
offset   63:48      47:40    39:32    31:16               15:0
0        buffer address [63:0]
8        vlan tag   errors   status   fragment checksum   length

after receiving a packet for the 82576, hardware stores the packet data into the indicated buffer and writes the length, packet checksum, status, and errors fields.

packet buffer address (64): physical address of the packet buffer.

length field (16)
length covers the data written to a receive buffer including crc bytes (if any). software must read multiple descriptors to determine the complete length for a packet that spans multiple receive buffers.

fragment checksum (16)
this field is used to provide the fragment checksum value. this field equals the unadjusted 16-bit ones complement of the packet. checksum calculation starts at the l4 layer (after the ip header) and continues until the end of the packet, excluding the crc bytes. in order to use the fragment checksum assist to offload l4 checksum verification, software might need to back out some of the bytes in the packet. for more details see section 7.1.10.2.

status field (8)
status information indicates whether the descriptor has been used and whether the referenced buffer is the last one for the packet. see table 7-5 for the layout of the status field. error status information is shown in figure 7-9.
- pif (bit 7): passed in-exact filter
- ipcs (bit 6): ipv4 checksum calculated on packet
- l4cs (bit 5): l4 (udp or tcp) checksum calculated on packet
- udpcs (bit 4): udp checksum calculated on packet
- vp (bit 3): packet is 802.1q (matched vet); indicates strip vlan in 802.1q packet
- rsv (bit 2): reserved
- eop (bit 1): end of packet
- dd (bit 0): descriptor done

table 7-5. receive status (rdesc.status) layout
7     6      5      4       3    2     1     0
pif   ipcs   l4cs   udpcs   vp   rsv   eop   dd

eop and dd
the following table lists the meaning of these bits:

table 7-6. receive status bits
dd   eop   description
0b   0b    software setting of the descriptor when it hands it off to the hardware.
0b   1b    reserved (invalid option).
1b   0b    a completion status indication for a non-last descriptor of a packet that spans across multiple descriptors. in a single packet case, dd indicates that the hardware is done with the descriptor and its buffers. only the length fields are valid on this descriptor.
1b   1b    a completion status indication of the entire packet. note that software might take ownership of its descriptors. all fields in the descriptor are valid (reported by the hardware).

vp field
the vp field indicates whether the incoming packet's type matches vet. for example, if the packet is a vlan (802.1q) type, it is set if the packet type matches vet and ctrl.vme is set. it also indicates that vlan has been stripped in the 802.1q packet. for more details, see section 7.4.

ipcs (ipv4 checksum), l4cs (l4 checksum), and udpcs (udp checksum)
the meaning of these bits is shown in the table below:

table 7-7. ipcs, l4cs, and udpcs
l4cs   udpcs   ipcs      functionality
0b     0b      0b        hardware does not provide checksum offload. special case: hardware does not provide udp checksum offload for an ipv4 packet with udp checksum = 0b.
1b     0b      1b / 0b   hardware provides ipv4 checksum offload (if ipcs is active) and tcp checksum offload. a pass/fail indication is provided in the error field: ipe and l4e.
0b     1b      1b / 0b   hardware provides ipv4 checksum offload (if ipcs is active) and udp checksum offload. a pass/fail indication is provided in the error field: ipe and l4e.

refer to table 7-18 for a description of supported packet types for receive checksum offloading. unsupported packet types do not have the ipcs or l4cs bits set. ipv6 packets do not have the ipcs bit set, but might have the l4cs bit set if the 82576 recognized the tcp or udp packet.

pif
hardware supplies the pif field to expedite software processing of packets. software must examine any packet with pif set to determine whether to accept the packet. if pif is clear, then the packet is known to be for this station, so software need not look at the packet contents. multicast packets passing only the multicast vector (mta) or unicast packets passing only the unicast hash table (uta) but not any of the mac address exact filters (rah, ral) have pif set. in addition, the following conditions cause pif to be cleared:
- the da of the packet is a multicast address and promiscuous multicast is set (rctl.mpe = 1b).
- the da of the packet is a broadcast address and accept broadcast mode is set (rctl.bam = 1b).
a mac control frame forwarded to the host (rctl.pmcf = 0b) that does not match any of the exact filters has the pif bit set.
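the legacy descriptor layout in table 7-4 and the dd/eop combinations in table 7-6 can be decoded with python's `struct` module. this sketch assumes the 16-byte descriptor is read from host memory in little-endian byte order (an assumption, since byte order is not stated in this section); the names `LegacyRxDesc`, `parse_legacy_rx_desc`, `status_flags`, and `completion_state` are illustrative, as is the sample descriptor.

```python
import struct
from collections import namedtuple

LegacyRxDesc = namedtuple(
    "LegacyRxDesc", "buffer_addr length frag_csum status errors vlan_tag")

# Bit positions from the status field layout (bit 2 is reserved).
STATUS_BITS = {"dd": 0, "eop": 1, "vp": 3, "udpcs": 4,
               "l4cs": 5, "ipcs": 6, "pif": 7}


def parse_legacy_rx_desc(raw):
    """Split a 16-byte legacy receive descriptor into its fields."""
    if len(raw) != 16:
        raise ValueError("legacy descriptor is 16 bytes")
    # Offset 0: buffer address. Offset 8: length [15:0], fragment checksum
    # [31:16], status [39:32], errors [47:40], vlan tag [63:48].
    return LegacyRxDesc(*struct.unpack("<QHHBBH", raw))


def status_flags(status):
    """Return the names of the status bits that are set."""
    return {name for name, bit in STATUS_BITS.items() if (status >> bit) & 1}


def completion_state(status):
    """Interpret the dd/eop combination."""
    dd, eop = status & 1, (status >> 1) & 1
    if dd and eop:
        return "packet complete"
    if dd:
        return "buffer complete, packet continues"
    if eop:
        return "reserved"
    return "not yet written back"


# Hypothetical descriptor: 64-byte packet, dd + eop + l4cs set.
raw = struct.pack("<QHHBBH", 0x1000, 64, 0xABCD, 0x23, 0x00, 0x0064)
desc = parse_legacy_rx_desc(raw)
print(desc.length, sorted(status_flags(desc.status)), completion_state(desc.status))
```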
error field (8)
most error information appears only when the store-bad-packet bit (rctl.sbp) is set and a bad packet is received. see figure 7-9 for a definition of the possible errors and their bit positions.
- rxe (bit 7): rx data error
- ipe (bit 6): ipv4 checksum error
- l4e (bit 5): tcp/udp checksum error
- reserved (bits 4:0)

table 7-8. rxe, ipe, l4e
7     6     5     4:0
rxe   ipe   l4e   reserved

ipe/l4e
the ip and tcp checksum error bits from figure 7-9 are valid only when the ipv4 or tcp/udp checksum(s) is performed on the received packet as indicated via ipcs and l4cs. these, along with the other error bits, are valid only when the eop and dd bits are set in the descriptor.

note: receive checksum errors have no effect on packet filtering.

if receive checksum offloading is disabled (rxcsum.ipofl and rxcsum.tuofl), the ipe and l4e bits are 0b.

rxe
the rxe error bit is asserted in one of two cases (software might distinguish between these errors by monitoring the respective statistics registers):
1. a crc error is detected. a crc error can be a result of reception of a /v/ symbol on the tbi interface (see section 3.5.3.3.2), assertion of rxerr on the mii/gmii interface, a bad eop, or loss of sync during packet reception. packets with a crc error are posted to host memory only when the store-bad-packet bit (rctl.sbp) is set.
2. hardware checks the data integrity when received packets are fetched from its internal packet buffer (see section 7.6 for details). packets with integrity errors are posted to host memory regardless of the store-bad-packet setting (rctl.sbp).

vlan tag field (16)
hardware stores additional information in the receive descriptor for 802.1q packets. if the packet type is 802.1q (determined when a packet matches vet and ctrl.vme = 1b), then the vlan tag field records the vlan information and the four-byte vlan information is stripped from the packet data storage. otherwise, the vlan tag field contains 0x0000. the rule for the vlan tag is to use network ordering (also called big endian). it appears in the following manner in the descriptor:

table 7-9. vlan tag field layout (for 802.1q packet)
15:13   12    11:0
pri     cfi   vlan
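the error field bits and the vlan tag layout in table 7-9 are simple bit fields. a short python sketch of the decoding, with illustrative function names and a made-up example tag value:

```python
def decode_vlan_tag(tag):
    """Split the 16-bit vlan tag field into pri, cfi, and vlan id."""
    return {
        "pri": (tag >> 13) & 0x7,   # bits 15:13
        "cfi": (tag >> 12) & 0x1,   # bit 12
        "vlan": tag & 0xFFF,        # bits 11:0
    }


def decode_errors(errors):
    """Return the error bits of the 8-bit error field as booleans."""
    return {
        "rxe": bool(errors & 0x80),  # bit 7: rx data error
        "ipe": bool(errors & 0x40),  # bit 6: ipv4 checksum error
        "l4e": bool(errors & 0x20),  # bit 5: tcp/udp checksum error
    }


tag_fields = decode_vlan_tag(0xA064)
print(tag_fields, decode_errors(0x20))
```

as stated above, the error bits are meaningful only when both eop and dd are set in the descriptor, so a caller would check the status field first.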
7.1.5 advanced receive descriptors

7.1.5.1 advanced receive descriptors (read format)
table 7-10 shows the receive descriptor. this is the format that software writes to the descriptor queue and hardware reads from the descriptor queue in host memory. hardware writes back the descriptor in a different format, shown in table 7-11.

table 7-10. descriptor read format
offset   63:1                           0
0        packet buffer address [63:1]   a0/nse
8        header buffer address [63:1]   dd

packet buffer address (64): physical address of the packet buffer. the lowest bit is either a0 (lsb of address) or nse (no-snoop enable), depending on bit rxctl.rxdatawritensen of the relevant queue. see section 8.13.1.

header buffer address (64): physical address of the header buffer. the lowest bit is dd.

note: the 82576 does not support null descriptors (that is, descriptors whose packet or header address is equal to zero).

when software sets the nse bit, the 82576 places the received packet associated with this descriptor in memory at the packet buffer address with nse set in the pcie attribute fields. nse does not affect the data written to the header buffer address.

when a packet spans more than one descriptor, the header buffer address is not used for the second, third, etc. descriptors; only the packet buffer address is used in this case.

nse is enabled for packet buffers that the software device driver knows have not been touched by the processor since the last time they were used, so the data cannot be in the processor cache and snoop is always a miss. avoiding these snoop misses improves system performance. no-snoop is particularly useful when the dma engine is moving the data from the packet buffer into application buffers, and the software device driver is using the information in the header buffer for its work with the packet.

note: when no-snoop enable is used, relaxed ordering should also be enabled with ctrl_ext.ro_dis.

7.1.5.2 advanced receive descriptors (writeback format)
when the 82576 writes back the descriptors, it uses the descriptor format shown in table 7-11.

note: srrctl[n].desctype must be set to a value other than 000b for the 82576 to write back the special descriptors.

table 7-11. descriptor write-back format
offset 0:
  63:32   rss hash value / fragment checksum and ip identification
  31      sph
  30:21   hdr_len
  20:17   rsv
  16:4    packet type
  3:0     rss type
offset 8:
  63:48   vlan tag
  47:32   pkt_len
  31:20   extended error
  19:0    extended status
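the write-back layout in table 7-11 can be unpacked with shifts and masks. this python sketch assumes the two quadwords are stored little endian in host memory and takes the field boundaries from the field widths given in this section (rss type 4 bits, packet type 13 bits, hdr_len 10 bits, sph 1 bit, extended status 20 bits); the function name and the sample values are illustrative.

```python
import struct


def parse_adv_rx_writeback(raw):
    """Decode a 16-byte advanced receive descriptor in write-back format."""
    if len(raw) != 16:
        raise ValueError("advanced descriptor is 16 bytes")
    q0, q1 = struct.unpack("<QQ", raw)
    return {
        "rss_type": q0 & 0xF,                   # bits 3:0
        "packet_type": (q0 >> 4) & 0x1FFF,      # bits 16:4
        "hdr_len": (q0 >> 21) & 0x3FF,          # bits 30:21
        "sph": (q0 >> 31) & 0x1,                # bit 31
        # Bits 63:32 carry either the rss hash or, when fragment checksum
        # reporting is selected instead, fragment checksum [63:48] and
        # ip identification [47:32].
        "rss_hash": (q0 >> 32) & 0xFFFFFFFF,
        "ext_status": q1 & 0xFFFFF,             # bits 19:0
        "ext_error": (q1 >> 20) & 0xFFF,        # bits 31:20
        "pkt_len": (q1 >> 32) & 0xFFFF,         # bits 47:32
        "vlan_tag": (q1 >> 48) & 0xFFFF,        # bits 63:48
    }


# Hypothetical write-back: tcp/ipv4 packet, 54-byte header, 1514-byte packet.
q0 = (0x51CCC178 << 32) | (1 << 31) | (54 << 21) | (0x11 << 4) | 0x1
q1 = (100 << 48) | (1514 << 32) | 0x3
fields = parse_adv_rx_writeback(struct.pack("<QQ", q0, q1))
print(fields)
```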
rss type (4)
the 82576 must identify the packet type and then choose the appropriate rss hash function to be used on the packet. the rss type reports the packet type that was used for the rss hash function.

table 7-12. rss type
rss type   description
0x0        no hash computation done for this packet.
0x1        hash_tcp_ipv4
0x2        hash_ipv4
0x3        hash_tcp_ipv6
0x4        hash_ipv6_ex
0x5        hash_ipv6
0x6        hash_tcp_ipv6_ex
0x7        hash_udp_ipv4
0x8        hash_udp_ipv6
0x9        hash_udp_ipv6_ex
0xa:0xf    reserved

packet type (13)
- vpkt (bit 12): vlan packet indication
the 12 lsb bits of the packet type report the packet type identified by the hardware as follows:

table 7-13. lsb bits
bit index   bit 11 = 0b                                  bit 11 = 1b (l2 packet)
0           ipv4: ipv4 header present                    ethertype: etqf register index that matches the packet. special types might be defined for 1588, 802.1x, or any other requested type.
1           ipv4e: ipv4 header includes extensions       (ethertype, continued)
2           ipv6: ipv6 header present                    (ethertype, continued)
3           ipv6e: ipv6 header includes extensions       reserved for future expansion of etqf
4           tcp: tcp header present                      (reserved, continued)
5           udp: udp header present                      reserved
6           sctp: sctp header present                    reserved
7           nfs: nfs header present                      reserved
8           ipsec esp: ipsec encapsulation [1]           reserved
9           ipsec ah: ipsec encapsulation                reserved
10          macsec: macsec encapsulation                 macsec: macsec encapsulation
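tables 7-12 and 7-13 translate directly into lookup tables. the python sketch below decodes both fields. the names are illustrative, and treating bits 2:0 as the etqf index for l2 packets is an assumption read from the table layout rather than a stated field definition.

```python
RSS_TYPES = {
    0x0: "no hash", 0x1: "hash_tcp_ipv4", 0x2: "hash_ipv4",
    0x3: "hash_tcp_ipv6", 0x4: "hash_ipv6_ex", 0x5: "hash_ipv6",
    0x6: "hash_tcp_ipv6_ex", 0x7: "hash_udp_ipv4",
    0x8: "hash_udp_ipv6", 0x9: "hash_udp_ipv6_ex",
}

# Bit 0 first; meanings for packets with bit 11 = 0b.
HEADER_BITS = ("ipv4", "ipv4e", "ipv6", "ipv6e", "tcp", "udp",
               "sctp", "nfs", "ipsec_esp", "ipsec_ah", "macsec")


def decode_rss_type(value):
    """Name the hash function reported in the 4-bit rss type field."""
    return RSS_TYPES.get(value & 0xF, "reserved")


def decode_packet_type(ptype):
    """Decode the 13-bit packet type field."""
    result = {"vlan": bool(ptype & (1 << 12)), "l2": bool(ptype & (1 << 11))}
    if result["l2"]:
        result["etqf_index"] = ptype & 0x7  # assumed to occupy bits 2:0
        result["macsec"] = bool(ptype & (1 << 10))
    else:
        result["headers"] = [name for bit, name in enumerate(HEADER_BITS)
                             if ptype & (1 << bit)]
    return result


print(decode_rss_type(0x1), decode_packet_type(0x11))
```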
rsv (5): reserved.

hdr_len (10): the length (bytes) of the header as parsed by the 82576. in split mode when hbo is set, the hdr_len can be greater than zero though nothing is written to the header buffer. in header replication mode (sph is also set in this mode), this does not reflect the size of the data actually stored in the header buffer because the 82576 fills the buffer up to the size configured by srrctl[n].bsizeheader, which might be larger than the header size reported here. this field is only valid in the first descriptor of a packet and should be ignored in all subsequent descriptors.

packet types supported by the header split and header replication are listed later. other packet types are posted sequentially in the host packet buffer. each line in the following table has an enable bit in the psrtype register. when one of the bits is set, the corresponding packet type is split. if the bit is not set, a packet matching the header layout is not split. header split and replication is described in section 7.1.9 while the packet types for this functionality are enabled by the psrtype[n] registers (section 8.10.3).

note: the header of a fragmented ipv6 packet is defined before the fragmented extension header.

sph (1): split header. when set, indicates that the hdr_len field reflects the length of the header found by hardware. if cleared, the hdr_len field should be ignored, unless in the "split - always use header buffer" mode, where pkt_len = 0, in which case the hdr_len reflects the size of the packet, even if sph is cleared.

rss hash / fragment checksum (32)
this field has multiplexed functionality according to the received packet type (reported on the packet type field in this descriptor) and device setting.
fragment checksum (16-bit; 63:48) the fragment checksum word contains the unadjusted one?s complement checksum of the ip payload and is used to offload checksum verification for fragmented udp packets as described in section 7.1.10.2 . this field is mutually exclusive with the rss hash. it is enabled when the rxcsum.pcsd bit is cleared and the rxcsum.ippcse bit is set. ip identification (16-bit; 47:32) the ip identification word identifies the ip packet to whom this fragment belongs and is used to offload checksum verification for fragmented udp packets as described in section 7.1.10.2 . this field is mutually exclusive with the rss hash. it is enabled when the rxcsum.pcsd bit is cleared and the rxcsum.ippcse bit is set. rss hash value (32) the rss hash value is required for rss functionality as described in section 7.1.1.7 . this bit is mutually exclusive with the ip identification and the fragment checksum. it is enabled when the rxcsum.pcsd bit is set. extended status (20) 1. ipsec functionality not available in 82576ns.
status information indicates whether the descriptor has been used and whether the referenced buffer is the last one for the packet. table 7-14 lists the extended status word in the last descriptor of a packet (eop is set). table 7-15 lists the extended status word in any descriptor but the last one of a packet (eop is cleared).

ts (16) - time stamped packet (time sync). the time stamp bit is set when the device recognizes a time sync packet. in that case, hardware captures the packet's arrival time and stores it in the "time stamp" register.
note: if tsyncrxctl.type = 100b, all packets are time stamped; however, this bit is never set because the time stamp value is not locked.

reserved (2, 8, 15:13, 19) - reserved at zero.

pif (7), ipcs (6), udpcs (4), vp (3), eop (1), dd (0) - these bits are described in the legacy descriptor format in section 7.1.4.

l4i (5) - this bit indicates that an l4 integrity check was done on the packet: a tcp checksum, a udp checksum, or an sctp crc checksum. this bit is valid only for the last descriptor of the packet. an error in the integrity check is indicated by the l4e bit in the error field. the type of check done can be deduced from packet type bits 4, 5, and 6: if bit 4 is set, a tcp checksum was done; if bit 5 is set, a udp checksum was done; if bit 6 is set, a crc checksum was done.

vext (9) - first vlan is found on a double vlan packet. this bit is valid only when ctrl_ext.extended_vlan is set. for more details see section 7.4.5.

udpv (10) - this bit indicates that an incoming fragmented udp ipv4 packet contains a valid (non-zero) checksum field. this means that the fragment checksum field in the receive descriptor contains the udp checksum as described in section 7.1.10.2.
when this bit is cleared in the first fragment that contains the udp header, the packet does not contain a valid udp checksum and the checksum field in the rx descriptor should be ignored. this bit is always cleared in incoming fragments that do not contain the udp header.

llint (11) - this bit indicates that the packet caused an immediate interrupt via the low latency interrupt mechanism.

secp (17) - the security processing bit indicates that hardware identified the security encapsulation and processed it as configured.

table 7-14. receive status (rdesc.status) layout of the last descriptor
bit(s)   field
19       rsv
18       lb
17       secp
16       ts
15:13    reserved
12       strip crc
11       llint
10       udpv
9        vext
8        rsv
7        pif
6        ipcs
5        l4i
4        udpcs
3        vp
2        rsv
1        eop
0        dd

table 7-15. receive status (rdesc.status) layout of non-last descriptor
bit(s)   field
19:2     reserved
1        eop = 0b
0        dd
• macsec processing: this bit is set each time macsec processing of the packet was attempted (for example, a macsec header was found and macsec offload is enabled), regardless of whether a matching sa was found. this bit is not set for clear packets even if they have a macsec header (such as secp packets).
• ipsec processing: this bit is set only if a matching sa was found. note that hardware does not process packets with an ipv4 option or an ipv6 extension header, and secp is not set for them.

lb (18) - this bit provides a loopback status indication, meaning that this packet was sent by a local virtual machine (vm-to-vm switch indication).

extended error (12)
table 7-16 and the text that follows describe the possible errors reported by hardware.

table 7-16. receive errors (rdesc.errors) layout
bit(s)   field
11       rxe
10       ipe
9        l4e
8:7      secerr
6:4      reserved
3        hbo
2:0      reserved

rxe (bit 11) - rxe is described in the legacy descriptor format in section 7.1.4.

ipe (bit 10) - the ipe error indication is described in the legacy descriptor format in section 7.1.4.

l4e (bit 9) - l4 error indication. when set, indicates that hardware attempted to do an l4 integrity check as described in the l4i bit, but the check failed.

security error (bits 8:7)
macsec status indicates potential errors in the macsec processing according to the following encoding.

code   error type
00b    no error
01b    no sa match
10b    replay error
11b    bad signature

note: ipsec functionality is not available in the 82576ns.

ipsec status (valid only if sa match, else zero)
indicates potential errors in the ipsec processing according to the following encoding:

code   error type
00b    no error: either no sa match (secp is cleared), or the incoming packet was successfully authenticated by hardware (see note 1).
01b    invalid ipsec protocol: the protocol field value included in the ip header (or in the ip next header for ipv6) does not match the proto field stored in the corresponding rx sa entry.
10b    packet length error: the esp packet is not 4-byte aligned, or the ah/esp header is truncated (for example, a 28-byte ipv4 packet with an ipv4 header and an esp header that contains only spi and sn), or the ah length field in the ah header is not valid (that is, not equal to 0x07 for ipv4 or to 0x08 for ipv6).
11b    authentication failed: for example, the computed icv field does not match the icv field included in the packet.

note 1: for incoming ipv4 packets where the protocol field is ah/esp and any ipv4 option is present, or for incoming ipv6 packets where there is an ah/esp extension header together with any other extension header (even another ah/esp extension header), no ipsec or layer 4 offload is performed by hardware and the packet is passed to software with the secp bit cleared (as for no sa match), without performing any sa lookup.

reserved (bits 6:4) - reserved.

hbo (bit 3) - header buffer overflow
note: this bit is relevant only if sph is set.
1. in both header replication modes, hbo is set if the header size (as calculated by hardware) is bigger than the allocated buffer size (srrctl.bsizeheader), but the replication still takes place up to the header buffer size. hardware sets this bit to indicate to software that it needs to allocate bigger buffers for the headers.
2. in header split mode, when srrctl[n].bsizeheader is smaller than hdr_len, hbo is set to 1b. in this case, the header is not split. instead, the header resides within the host packet buffer. the hdr_len field is still valid and equal to the calculated size of the header. however, the header is not copied into the header buffer.
3. in "header split, always use header buffer" mode, when srrctl[n].bsizeheader is smaller than hdr_len, hbo is set to 1b. in this case, the header buffer is used as part of the data buffers and contains the first bsizeheader bytes of the packet. the hdr_len field is still valid and equal to the calculated size of the header.

note: most error information appears only when the store-bad-packet bit (rctl.sbp) is set and a bad packet is received.

using srrctl.bsizeheader, the maximum buffer size supported is 960 bytes.

reserved (bits 2:0) - reserved.

pkt_len (16) - number of bytes existing in the host packet buffer
the length covers the data written to a receive buffer, including crc bytes (if any). software must read multiple descriptors to determine the complete length for packets that span multiple receive buffers. if srrctl.desc_type = 4 (advanced descriptor header replication, large packet only) and the total packet length is smaller than the size of the header buffer (no replication is done), this field continues to reflect the size of the packet, although no data is written to the packet buffer. otherwise, if the buffer is not split because the header is bigger than the allocated header buffer, this field reflects the size of the data written to the first packet buffer (header and data).

vlan tag (16)
these bits are described in the legacy descriptor format in section 7.1.4.
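the bit positions in tables 7-14, 7-15, and 7-16 can be turned into a small decoder. the following is a minimal sketch, not driver code: it assumes the 20-bit extended status and 12-bit extended error values have already been extracted from the written-back descriptor, and the names decode_status, decode_errors, STATUS_BITS, ERROR_BITS, and MACSEC_SECERR are hypothetical.

```python
# bit positions from table 7-14 (last descriptor of a packet, eop = 1)
STATUS_BITS = {
    "dd": 0, "eop": 1, "vp": 3, "udpcs": 4, "l4i": 5, "ipcs": 6, "pif": 7,
    "vext": 9, "udpv": 10, "llint": 11, "strip_crc": 12,
    "ts": 16, "secp": 17, "lb": 18,
}

# single-bit flags from table 7-16
ERROR_BITS = {"hbo": 3, "l4e": 9, "ipe": 10, "rxe": 11}

# macsec encoding of the secerr field (bits 8:7 of the error field)
MACSEC_SECERR = {0: "no error", 1: "no sa match", 2: "replay error", 3: "bad signature"}


def decode_status(status: int) -> dict:
    """decode the extended status word of an advanced rx descriptor."""
    flags = {name: bool((status >> bit) & 1) for name, bit in STATUS_BITS.items()}
    if not flags["eop"]:
        # table 7-15: a non-last descriptor only defines dd and eop
        return {"dd": flags["dd"], "eop": False}
    return flags


def decode_errors(errors: int) -> dict:
    """decode the extended error field of an advanced rx descriptor."""
    out = {name: bool((errors >> bit) & 1) for name, bit in ERROR_BITS.items()}
    out["secerr"] = (errors >> 7) & 0x3  # macsec or ipsec status code
    return out
```

a caller would typically check dd first, then eop, and only interpret the remaining flags and the error field on the last descriptor of a packet.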
7.1.6 receive descriptor fetching

the fetching algorithm attempts to make the best use of pcie bandwidth by fetching a cache line (or more) of descriptors with each burst. the following paragraphs briefly describe the descriptor fetch algorithm and the software control provided.

when the on-chip buffer is empty, a fetch happens as soon as any descriptors are made available (host writes to the tail pointer).

when the on-chip buffer is nearly empty (rxdctl.pthresh), a prefetch is performed each time enough valid descriptors (rxdctl.hthresh) are available in host memory.

when the number of descriptors in host memory is greater than the available on-chip descriptor storage, the 82576 might elect to perform a fetch that is not a multiple of cache-line size. hardware performs this non-aligned fetch if doing so results in the next descriptor fetch being aligned on a cache-line boundary. this enables the descriptor fetch mechanism to be most efficient in the cases where it has fallen behind software.

all fetch decisions are based on the number of descriptors available and do not take into account any split of the transaction due to bus access limitations.

note: the 82576 never fetches descriptors beyond the descriptor tail pointer.

7.1.7 receive descriptor write-back

processors have cache-line sizes that are larger than the receive descriptor size (16 bytes). consequently, writing back descriptor information for each received packet would cause expensive partial cache-line updates. a receive descriptor packing mechanism minimizes the occurrence of partial line write-backs.

to maximize memory efficiency, receive descriptors are packed together and written as a cache line whenever possible. descriptor write-backs accumulate and are opportunistically written out in cache line-oriented chunks, under the following scenarios:
• rxdctl.wthresh descriptors have been used (the specified maximum threshold of unwritten used descriptors has been reached).
• the receive timer expires (eitr). in this case all descriptors are flushed, ignoring any cache-line boundaries.
• explicit software flush (rxdctln.swfls).
• dynamic packets. if at least one of the descriptors waiting for write-back is classified as a packet requiring immediate notification, the entire queue is flushed out.

when the number of descriptors specified by rxdctl.wthresh have been used, they are written back regardless of cache-line alignment. it is therefore recommended that wthresh be a multiple of cache-line size. when the receive timer (eitr) expires, all used descriptors are forced to be written back prior to initiating the interrupt, for consistency. software might explicitly flush accumulated descriptors by writing the rxdctln register with the swfls bit set.

when the 82576 does a partial cache-line write-back, it attempts to recover to cache-line alignment on the next write-back.

for applications where the latency of received packets is more important than bus efficiency and cpu utilization, an eitr value of zero may be used. in this case, each receive descriptor is written to the host immediately. if rxdctl.wthresh equals zero, each descriptor is written back separately; otherwise, write-back of descriptors may be coalesced if descriptors accumulate in the internal descriptor ring due to bandwidth constraints.
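the four write-back triggers listed above can be summarized as a single decision function. this is only an illustrative model of the text, not a description of the hardware's internal logic: the function name write_back_reason, its parameters, and the order in which the conditions are tested are all assumptions.

```python
def write_back_reason(pending, wthresh, timer_expired=False,
                      swfls=False, immediate=False):
    """return why accumulated descriptors would be flushed, or None.

    pending   - used descriptors not yet written back (hypothetical counter)
    wthresh   - value of rxdctl.wthresh
    immediate - at least one pending packet requires immediate notification
    """
    if pending == 0:
        return None  # nothing to write back
    if swfls:
        return "software flush"
    if timer_expired:
        return "eitr expired"
    if immediate:
        return "immediate notification"
    # wthresh = 0 means every descriptor is written back separately
    if wthresh == 0 or pending >= wthresh:
        return "wthresh reached"
    return None  # keep accumulating
```

for example, with wthresh = 8 and three pending descriptors the model keeps accumulating until a fourth condition (timer, flush, or immediate packet) fires.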
all write-back decisions are based on the number of descriptors available and do not take into account any split of the transaction due to bus access limitations.

7.1.8 receive descriptor ring structure

figure 7-9 shows the structure of each of the 16 receive descriptor rings. hardware maintains 16 circular queues of descriptors and writes back used descriptors just prior to advancing the head pointer(s). head and tail pointers wrap back to base when size descriptors have been processed.

software inserts receive descriptors by advancing the tail pointer(s) to refer to the address of the entry just beyond the last valid descriptor. this is accomplished by writing the descriptor tail register(s) with the offset of the entry beyond the last valid descriptor. the hardware adjusts its internal tail pointer(s) accordingly. as packets arrive, they are stored in memory and the head pointer(s) is incremented by hardware. when the head pointer(s) is equal to the tail pointer(s), the queue(s) is empty. hardware stops storing packets in system memory until software advances the tail pointer(s), making more receive buffers available.

figure 7-9. receive descriptor ring structure
the receive descriptor head and tail pointers reference 16-byte blocks of memory. shaded boxes in figure 7-9 represent descriptors that have stored incoming packets but have not yet been recognized by software. software can determine if a receive buffer is valid by reading the descriptors in memory. any descriptor with a non-zero dd value has been processed by the hardware and is ready to be handled by the software.

note: the head pointer points to the next descriptor that is written back. after the descriptor write-back operation completes, this pointer is incremented by the number of descriptors written back. hardware owns all descriptors between [head..tail]. any descriptor not in this range is owned by software.

the receive descriptor rings are described by the following registers:
• receive descriptor base address (rdba15 to rdba0) registers: this register indicates the start of the descriptor ring buffer. this 64-bit address is aligned on a 16-byte boundary and is stored in two consecutive 32-bit registers. note that hardware ignores the lower 4 bits.
• receive descriptor length (rdlen15 to rdlen0) registers: this register determines the number of bytes allocated to the circular buffer. this value must be a multiple of 128 (the maximum cache-line size). since each descriptor is 16 bytes in length, the total number of receive descriptors is always a multiple of eight.
• receive descriptor head (rdh15 to rdh0) registers: this register holds a value that is an offset from the base and indicates the in-progress descriptor. there can be up to 64k - 8 descriptors in the circular buffer. hardware maintains a shadow copy that includes those descriptors completed but not yet stored in memory.
• receive descriptor tail (rdt15 to rdt0) registers: this register holds a value that is an offset from the base and identifies the location beyond the last descriptor hardware can process. this is the location where software writes the first new descriptor.

if software statically allocates buffers, uses legacy receive descriptors, and uses memory reads to check for completed descriptors, it has to zero the status byte in the descriptor before bumping the tail pointer to make it ready for reuse by hardware. zeroing the status byte is not a hardware requirement but is necessary for performing an in-memory scan.

all the registers controlling the behavior of the descriptor rings should be set before receive is enabled, apart from the tail registers, which are used during the regular flow of data.

7.1.8.1 low receive descriptors threshold

as described above, the size of the receive queues is measured by the number of receive descriptors. during run time, software processes completed descriptors and then increments the receive descriptor tail registers (rdt). at the same time, hardware may post new packets received from the lan, incrementing the receive descriptor head registers (rdh) for each used descriptor. the number of usable (free) descriptors for the hardware is the distance between the tail and head registers. when the tail reaches the head, there are no free descriptors and further packets may be either dropped or block the receive fifo. to avoid this, the 82576 may generate a low latency interrupt (associated with the relevant rx queue) once the number of free descriptors is less than or equal to a threshold. the threshold is defined per queue, in a granularity of 16 descriptors, in the srrctl[n].rdmts field.
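the ring arithmetic described in sections 7.1.8 and 7.1.8.1 can be sketched as follows. this is a minimal model under stated assumptions: head and tail are descriptor indices (offsets from the base, as read from rdh and rdt), and the helper names ring_size, free_descriptors, and low_threshold_reached are hypothetical.

```python
DESCRIPTOR_SIZE = 16  # bytes per receive descriptor


def ring_size(rdlen_bytes: int) -> int:
    """number of descriptors in a ring of rdlen bytes."""
    if rdlen_bytes % 128:
        raise ValueError("rdlen must be a multiple of 128 bytes")
    return rdlen_bytes // DESCRIPTOR_SIZE


def free_descriptors(head: int, tail: int, size: int) -> int:
    """descriptors still usable by hardware: distance from head to tail.

    head == tail gives zero, matching the empty-queue condition in the text.
    """
    return (tail - head) % size


def low_threshold_reached(head: int, tail: int, size: int, rdmts: int) -> bool:
    """true when free descriptors <= threshold (rdmts is in units of 16)."""
    return free_descriptors(head, tail, size) <= rdmts * 16
```

the modulo handles the wrap-around case where the tail index is numerically smaller than the head index.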
7.1.9 header splitting and replication

7.1.9.1 purpose

this feature consists of splitting or replicating a packet's header to a different memory space. this helps the host to fetch headers only for processing: headers are replicated through a regular snoop transaction in order to be processed by the host cpu. it is recommended to perform this transaction with the dca feature enabled (see section 8.3) or in conjunction with a software prefetch.

the packet (header and payload) is stored in memory through an (optionally) non-snoop transaction. later, a data movement engine transaction moves the payload from the software device driver buffer to application memory, or it is moved using a normal memory copy operation.

the 82576 supports header splitting in several modes:
• legacy mode: legacy descriptors are used; headers and payloads are not split.
• advanced mode, no split: advanced descriptors are in use; header and payload are not split.
• advanced mode, split: advanced descriptors are in use; header and payload are split to different buffers. if the packet cannot be split, only the packet buffer is used.
• advanced mode, replication: advanced descriptors are in use; header is replicated in a separate buffer and also in a payload buffer.
• advanced mode, replication, conditioned by packet size: advanced descriptors are in use; replication is performed only if the packet is larger than the header buffer size.
• advanced mode, split, always use header buffer: advanced descriptors are in use; header and payload are split to different buffers. if no split is done, the first part of the packet is stored in the header buffer.

7.1.9.2 description

figure 7-10 and figure 7-11 show the header splitting and header replication modes.
the physical address of each buffer is written in the buffer address fields. the sizes of these buffers are statically defined by bsizepacket in the srrctl[n] registers.

the packet buffer address field contains the address of the buffer assigned to the replicated packet, including the header and data payload portions of the received packet. in the case of a split header, only the payload is included.

the header buffer address field contains the address of the buffer that holds the header information. the receive dma module stores the header portion of the received packets into this buffer.

figure 7-10. header splitting

figure 7-11. header replication
the 82576 uses the packet replication or splitting feature when srrctl[n].desctype is larger than one. the software device driver must also program the buffer sizes in the srrctl[n] registers.

when header split is selected, the packet is split only on selected types of packets. a bit exists for each option in the psrtype[n] registers, so several options can be used in conjunction. if one or more bits are set, the splitting is performed for the corresponding packet type.

the following table lists the behavior of the 82576 in the different modes:

table 7-17. intel® 82576 gbe controller behavior

desctype: split
  1. condition: header can't be decoded
     sph = 0b, hbo = 0b, pkt_len = min(packet length, buffer size), hdr_len = n/a
     dma: header + payload -> packet buffer
  2. condition: header <= bsizeheader
     sph = 1b, hbo = 0b, pkt_len = min(payload length, buffer size) (note 1), hdr_len = header size
     dma: header -> header buffer; payload -> packet buffer
  3. condition: header > bsizeheader
     sph = 1b, hbo = 1b, pkt_len = min(packet length, buffer size), hdr_len = header size (note 2)
     dma: header + payload -> packet buffer

desctype: split - always use header buffer
  1. condition: packet length <= bsizeheader
     sph = 0b, hbo = 0b, pkt_len = zero, hdr_len = packet length
     dma: header + payload -> header buffer
  2. condition: header can't be decoded and packet length > bsizeheader
     sph = 0b, hbo = 0b, pkt_len = min(packet length - bsizeheader, data buffer size), hdr_len = bsizeheader
     dma: header + payload -> header + packet buffers (note 3)
  3. condition: header <= bsizeheader and packet length >= bsizeheader
     sph = 1b, hbo = 0b, pkt_len = min(payload length, data buffer size), hdr_len = header size
     dma: header -> header buffer; payload -> packet buffer
  4. condition: header > bsizeheader
     sph = 1b, hbo = 1b, pkt_len = min(packet length - bsizeheader, data buffer size), hdr_len = header size (note 2)
     dma: header + payload -> header + packet buffer (note 3)

desctype: replicate large packet only
  1. condition: header + payload <= bsizeheader
     sph = 0b/1b (note 4), hbo = 0b, pkt_len = packet length, hdr_len = header size, n/a (note 4)
     dma: header + payload -> header buffer
  2. condition: header + payload > bsizeheader
     sph = 0b/1b (note 4), hbo = 0b/1b (note 5), pkt_len = min(packet length, buffer size), hdr_len = header size, n/a (note 4)
     dma: (header + payload)(partial, note 6) -> header buffer; header + payload -> packet buffer

notes:
1. in a header-only packet (such as a tcp ack packet), the pkt_len is zero.
2. the hdr_len doesn't reflect the actual data size stored in the header buffer. it reflects the header size determined by the parser.
3. if the packet spans more than one descriptor, only the header buffer of the first descriptor is used. the header buffer is used for the first part of the packet until it is filled up, and then the first packet buffer is used for the continuation of the packet.

software notes:
• if srrctl#.nse is set, all buffers' addresses in a packet descriptor must be word aligned.
• a packet header can't span across buffers; therefore, the size of the header buffer must be larger than any expected header size. otherwise, only the part of the header fitting the header buffer is replicated. in the case of header split mode (srrctl.desctype = 010b), a packet with a header larger than the header buffer is not split.
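the three "split" rows of table 7-17 can be expressed as a small function that predicts the descriptor fields software should expect in the first descriptor of a packet. this is a sketch of the table only, not of hardware internals: the function name split_mode_outcome and its arguments are hypothetical, and header_len = None is used here to stand for "header can't be decoded".

```python
def split_mode_outcome(packet_len, header_len, bsizeheader, bsizepacket):
    """expected sph/hbo/pkt_len/hdr_len for desctype = split (table 7-17).

    header_len is the header size found by the parser, or None if the
    header can't be decoded.
    """
    if header_len is None:
        # row 1: whole packet goes to the packet buffer, hdr_len is n/a
        return {"sph": 0, "hbo": 0,
                "pkt_len": min(packet_len, bsizepacket), "hdr_len": None}
    if header_len <= bsizeheader:
        # row 2: header -> header buffer, payload -> packet buffer
        payload_len = packet_len - header_len
        return {"sph": 1, "hbo": 0,
                "pkt_len": min(payload_len, bsizepacket), "hdr_len": header_len}
    # row 3: header too large, packet is not split; hdr_len is still reported
    return {"sph": 1, "hbo": 1,
            "pkt_len": min(packet_len, bsizepacket), "hdr_len": header_len}
```

a header-only packet (for example a 54-byte tcp ack with a 54-byte header) yields pkt_len = 0, matching note 1 of the table.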
7.1.10 receive packet checksum off loading

the 82576 supports the off loading of three receive checksum calculations: the packet checksum, the ipv4 header checksum, and the tcp/udp checksum.

the packet checksum is the one's complement over the receive packet, starting from the byte indicated by rxcsum.pcss (zero corresponds to the first byte of the packet), after stripping. for packets with a vlan header, the packet checksum includes the header if vlan stripping is not enabled by ctrl.vme. if vlan header stripping is enabled, the packet checksum and the starting offset of the packet checksum exclude the vlan header because the vlan header is masked. for example, for an ethernet ii frame encapsulated as an 802.3ac vlan packet, with ctrl.vme set and rxcsum.pcss set to 14, the packet checksum includes the entire encapsulated frame, excluding the 14-byte ethernet header (da, sa, type/length) and the 4-byte q-tag. the packet checksum does not include the ethernet crc if the rctl.secrc bit is set.

software must make the required offsetting computation (to back out the bytes that should not have been included and to include the pseudo-header) prior to comparing the packet checksum against the tcp checksum stored in the packet.

for supported packet/frame types, the entire checksum calculation can be off loaded to the 82576. if rxcsum.ipofl is set to 1b, the 82576 calculates the ipv4 checksum and reports a pass/fail indication to software via the ipv4 checksum error bit (rdesc.ipe) in the error field of the receive descriptor. similarly, if rxcsum.tuofl is set to 1b, the 82576 calculates the tcp or udp checksum and reports a pass/fail condition to software via the tcp/udp checksum error bit (rdesc.l4e).
these error bits are valid when the respective status bits indicate the checksum was calculated for the packet (rdesc.ipcs and rdesc.l4cs, respectively). similarly, if rfctl.ipv6_dis and rfctl.ip6xsum_dis are cleared to 0b and rxcsum.tuofl is set to 1b, the 82576 calculates the tcp or udp checksum for ipv6 packets. it then indicates a pass/fail condition in the tcp/udp checksum error bit (rdesc.l4e).

if neither rxcsum.ipofl nor rxcsum.tuofl is set, the checksum error bits (ipe and l4e) are 0b for all packets.

supported frame types:
• ethernet ii
• ethernet snap
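the "offsetting computation" that software performs on the packet checksum relies on one's complement arithmetic: bytes that should not have been summed can be backed out by one's complement subtraction. the sketch below illustrates this for an even-length prefix (such as a 20-byte ipv4 header); it is an illustration in the style of rfc 1071, not the driver's actual code, and the helper names ones_sum, ones_add, and ones_sub are hypothetical.

```python
def ones_sum(data: bytes, start: int = 0) -> int:
    """16-bit one's complement sum of data[start:], big-endian words."""
    data = data[start:]
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def ones_add(a: int, b: int) -> int:
    """one's complement addition of two 16-bit values."""
    s = a + b
    return (s & 0xFFFF) + (s >> 16)


def ones_sub(a: int, b: int) -> int:
    """one's complement subtraction: add the bitwise complement of b."""
    return ones_add(a, (~b) & 0xFFFF)
```

with these helpers, a sum reported from offset pcss can be reduced to the sum of the transport segment by subtracting the sum of the bytes in between, and the pseudo-header sum can then be added with ones_add before comparing against the expected result.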
table 7-18. supported receive checksum capabilities

packet type                                                        hardware ip checksum   hardware tcp/udp checksum
                                                                   calculation            calculation
ipv4 packets                                                       yes                    yes
ipv6 packets                                                       no (n/a)               yes
ipv6 packet with next header options:
  • hop-by-hop options                                             no (n/a)               yes
  • destinations options                                           no (n/a)               yes
  • routing (with len zero)                                        no (n/a)               yes
  • routing (with len > zero)                                      no (n/a)               no
  • fragment                                                       no (n/a)               no
  • home option                                                    no (n/a)               no
ipv4 tunnels:
  • ipv4 packet in an ipv4 tunnel                                  no                     no
  • ipv6 packet in an ipv4 tunnel                                  yes (ipv4)             yes (note 1)
ipv6 tunnels:
  • ipv4 packet in an ipv6 tunnel                                  no                     no
  • ipv6 packet in an ipv6 tunnel                                  no                     no
packet is an ipv4 fragment                                         yes                    no
packet is greater than 1518/1522/1526 bytes (lpe = 1b)             yes                    yes
packet has 802.3ac tag                                             yes                    yes
ipv4 packet has ip options (ip header is longer than 20 bytes)     yes                    yes
packet has tcp or udp options                                      yes                    yes
ip header's protocol field contains a protocol number other
than tcp or udp                                                    yes                    no

note 1: the ipv6 header portion can include supported extension headers as described in the ipv6 filter section.

7.1.10.1 filters details

the previous table lists general details about what packets are processed. in more detail, the packets are passed through a series of filters to determine if a receive checksum is calculated:

7.1.10.1.1 mac address filter

this filter checks the mac destination address to be sure it is valid (such as ia match, broadcast, multicast, etc.). the receive configuration settings determine which mac addresses are accepted. see the various receive control configuration registers such as rctl (rctl.upe, rctl.mpe, rctl.bam), mta, ral, and rah.

7.1.10.1.2 snap/vlan filter
this filter checks the next headers looking for an ip header. it is capable of decoding ethernet ii, ethernet snap, and ieee 802.3ac headers. it skips past any of these intermediate headers and looks for the ip header. the receive configuration settings determine which next headers are accepted. see the various receive control configuration registers such as rctl (rctl.vfe), vet, and vfta.

7.1.10.1.3 ipv4 filter

this filter checks for valid ipv4 headers. the version field is checked for a correct value (4). ipv4 headers are accepted if their length is greater than or equal to five dwords. if the ipv4 header is properly decoded, the ip checksum is checked for validity. the rxcsum.ipofl bit must be set for this filter to pass.

7.1.10.1.4 ipv6 filter

this filter checks for valid ipv6 headers, which are a fixed size and have no checksum. the ipv6 extension headers accepted are: hop-by-hop, destination options, and routing. the maximum size next header accepted is 16 dwords (64 bytes).

7.1.10.1.5 ipv6 extension headers

ipv4 and tcp provide header lengths, which enable hardware to easily navigate through these headers on packet reception for calculating checksums, crcs, etc. for received ipv6 packets, however, there is no ip header length to help hardware find the packet's ulp (such as tcp or udp) header. one or more ipv6 extension headers might exist in a packet between the basic ipv6 header and the ulp header. the hardware must skip over these extension headers to calculate the tcp or udp checksum for received packets.

the ipv6 header length without extensions is 40 bytes. the ipv6 next header type field indicates what type of header follows the ipv6 header at offset 40. it might be an upper layer protocol header such as tcp or udp (next header type of 6 or 17, respectively), or it might indicate that an extension header follows.
the final extension header indicates with its next header type field the type of ulp header for the packet.

ipv6 extension headers have a specified order. however, destinations must be able to process these headers in any order. also, ipv6 (or ipv4) might be tunneled using ipv6, and thus another ipv6 (or ipv4) header and potentially its extension headers might be found after the extension headers.

the ipv4 next header type is at byte offset nine. in ipv6, the first next header type is at byte offset six.

all ipv6 extension headers have the next header type in their first eight bits. most have the length in the second eight bits (offset byte[1]) as shown:

table 7-19. typical ipv6 extended header format (traditional representation)
bits     field
0:7      next header type
8:15     length
table 7-20 lists the encoding of the next header type field and information on determining each header type's length. the ipv6 extension headers are not otherwise processed by the 82576, so their details are not covered here. note that the 82576 hardware acceleration does not support all ipv6 extension header types (refer to table 7-20). also, the rfctl.ipv6_dis bit must be cleared for this filter to pass.

7.1.10.1.6 udp/tcp filter

this filter checks for a valid udp or tcp header. the protocol / next header values are 0x11 and 0x06, respectively. the rxcsum.tuofl bit must be set for this filter to pass.

table 7-20. header type encoding and lengths

header                            next header type   header length (units are bytes unless otherwise specified)
ipv6                              6                  always 40 bytes
ipv4                              4                  offset bits[7:4], unit = 4 bytes
tcp                               6                  offset byte[12].bits[7:4], unit = 4 bytes
udp                               17                 always 8 bytes
hop by hop options                0 (note 1)         8+offset byte[1]
destination options               60                 8+offset byte[1]
routing                           43                 8+offset byte[1]
fragment                          44                 always 8 bytes
authentication                    51                 8+4*(offset byte[1])
encapsulating security payload    50                 note 3
no next header                    59                 note 2

notes:
1. hop-by-hop options header is only found in the first next header type of an ipv6 header.
2. when a no next header type is encountered, the rest of the packet should not be processed.
3. encapsulating security payload: the intel® 82576 gbe controller cannot offload packets with this header type.
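the extension header walk described in section 7.1.10.1.5 can be sketched in software as follows. this is a minimal illustration, not the hardware algorithm. it assumes the input starts at the first byte of the ipv6 header, skips only the three extension types the ipv6 filter accepts (hop-by-hop, destination options, routing), and applies the 64-byte limit from section 7.1.10.1.4. the extension header length is computed as (byte[1] + 1) * 8 bytes, following the hdr ext len definition in rfc 8200; that formula is an assumption taken from the rfc rather than from table 7-20. the name find_ulp is hypothetical.

```python
IPV6_HEADER_LEN = 40          # fixed ipv6 header size in bytes
NEXT_HEADER_OFFSET = 6        # next header type byte in the ipv6 header
SKIPPABLE = {0, 43, 60}       # hop-by-hop, routing, destination options
TCP, UDP = 6, 17
MAX_EXT_LEN = 64              # largest next header accepted (16 dwords)


def find_ulp(packet: bytes):
    """return (protocol, offset) of the tcp/udp header, or None.

    None means the walk hit a header type that is not skipped here
    (fragment, esp, no next header, ...) or ran out of data.
    """
    if len(packet) < IPV6_HEADER_LEN:
        return None
    next_hdr = packet[NEXT_HEADER_OFFSET]
    offset = IPV6_HEADER_LEN
    while next_hdr in SKIPPABLE:
        if offset + 2 > len(packet):
            return None
        # assumption: length per rfc 8200, in 8-byte units excluding the first 8
        ext_len = (packet[offset + 1] + 1) * 8
        if ext_len > MAX_EXT_LEN:
            return None
        next_hdr = packet[offset]   # type of the header that follows
        offset += ext_len
    if next_hdr in (TCP, UDP) and offset <= len(packet):
        return next_hdr, offset
    return None
```

for a packet with one 8-byte hop-by-hop header followed by udp, the walk returns protocol 17 at offset 48.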
7.1.10.2 receive udp fragmentation checksum

the 82576 can provide receive fragmented udp checksum offload. the 82576 should be configured in the following manner to enable this mode:

- the rxcsum.pcsd bit should be cleared. the packet checksum and ip identification fields are mutually exclusive with the rss hash. when the pcsd bit is cleared, packet checksum and ip identification are active instead of the rss hash.
- the rxcsum.ippcse bit should be set. this bit enables the ip payload checksum, which is designed for the fragmented udp checksum.
- the rxcsum.pcss field must be zero. the packet checksum start should be zero to enable auto-start of the checksum calculation.

the following table lists the exact description of the checksum calculation, together with the resulting descriptor fields for each incoming packet type:

table 7-21. descriptor fields

incoming packet type                       fragment checksum                      udpv                           udpcs / l4cs
non ip packet                              0b                                     0b                             0b / 0b
ipv6 packet                                0b                                     0b                             depends on transport header
non fragmented ipv4 packet                 0b                                     0b                             depends on transport header
fragmented ipv4, when not first fragment   unadjusted one's complement checksum   0b                             1b / 0b
                                           of the ip payload
fragmented ipv4, for the first fragment    same as above                          1b if the udp header           1b / 0b
                                                                                  checksum is valid (not zero)

note: when the software device driver computes the 16-bit one's complement sum over the incoming udp fragments, it should expect a value of 0xffff.

refer to section 7.1.10 for supported packet formats.

7.1.11 sctp offload

if a receive packet is identified as sctp, the 82576 checks the crc32 checksum of this packet and identifies this packet as sctp. software is notified of the crc check via the crcv bit in the extended status field of the rx descriptor. the detection of an sctp packet is indicated via the sctp bit in the packet type field of the rx descriptor.

the checker assumes the following sctp packet format:

table 7-22. sctp header

bits 0-15                  bits 16-31
source port                destination port
verification tag (bits 0-31)
checksum (bits 0-31)
chunks 1..n
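as a point of reference for the crc check described in section 7.1.11, the following is a minimal software sketch of the checksum computation. it assumes that the "crc32" in the text is crc32c (castagnoli), which is what rfc 4960 specifies for the sctp checksum field. the function name crc32c is chosen here for illustration and does not come from the datasheet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC32c (Castagnoli), reflected polynomial 0x82F63B78.
 * Assumption: this is the CRC the text calls "crc32"; SCTP (RFC 4960)
 * uses CRC32c for its checksum field. Slow but table-free. */
uint32_t crc32c(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* mask is all-ones when the low bit is set, else zero */
            uint32_t mask = 0u - (crc & 1u);
            crc = (crc >> 1) ^ (0x82F63B78u & mask);
        }
    }
    return ~crc;
}
```

when a driver checks a received sctp packet in software, rfc 4960 has the checksum field treated as zero while the crc is computed over the whole sctp packet. with the offload enabled, the crcv bit reports the hardware check instead.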
7.2 transmit functionality

7.2.1 packet transmission

output packets are made up of pointer-length pairs constituting a descriptor chain (descriptor based transmission). software forms transmit packets by assembling the list of pointer-length pairs, storing this information in the transmit descriptor, and then updating the on-chip transmit tail pointer to the descriptor. the transmit descriptor and buffers are stored in host memory. hardware typically transmits the packet only after it has completely fetched all the l2 packet data from host memory and deposited it into the on-chip transmit fifo. this permits tcp or udp checksum computation and avoids problems with pcie under-runs.

another transmit feature of the 82576 is tcp/udp segmentation. the hardware has the capability to perform packet segmentation on large data buffers offloaded from the network operating system (nos). this feature is discussed in detail in section 7.2.4.

in addition, the 82576 supports sctp offloading for transmit requests. see section 7.2.5.3 for details about sctp.

7.2.1.1 transmit data storage

data is stored in buffers pointed to by the descriptors. alignment of data is on an arbitrary byte boundary with the maximum size per descriptor limited only to the maximum allowed packet size (9728 bytes). a packet typically consists of two (or more) buffers, one (or more) for the header and one for the actual data. each buffer is referenced by a different descriptor. some software implementations copy the header(s) and packet data into one buffer and use only one descriptor per transmitted packet.

7.2.1.2 on-chip tx buffers

the 82576 contains a 40 kb packet buffer that can be used to store packets until they are forwarded to the network or locally to another virtual machine (vm).
7.2.1.3 on-chip descriptor buffers

the 82576 contains a 32 descriptor cache for each transmit queue used to reduce the latency of packet processing and to optimize the usage of the pcie bandwidth by fetching and writing back descriptors in bursts. the fetch and write-back algorithms are described in section 7.2.2.5 and section 7.2.2.6.

7.2.1.4 transmit contexts

the 82576 provides hardware checksum offload and tcp/udp segmentation facilities. these features enable tcp and udp packet types to be handled more efficiently by performing additional work in hardware, thus reducing the software overhead associated with preparing these packets for transmission. part of the parameters used by these features is handled through contexts.

a context refers to a set of device registers loaded or accessed as a group to provide a particular function. the 82576 supports 32 context register sets on-chip (two per queue). the transmit queues can contain transmit data descriptors (much like the receive queue) as well as transmit context descriptors.
the contexts are queue specific and one context cannot be reused from one queue to another. this differs from the method used in previous devices that supported a pool of contexts to be shared between queues.

a transmit context descriptor differs from a data descriptor as it does not point to packet data. instead, this descriptor provides the ability to write to the on-chip contexts that support the transmit checksum offloading and the segmentation features of the 82576.

the 82576 supports one type of transmit context. this on-chip context is written with a transmit context descriptor dtyp=2 and is always used for transmit data descriptor dtyp=3. the idx field contains an index to one of the two queue contexts. software must track what context is stored in each idx location.

each advanced data descriptor that uses any of the advanced offloading features must refer to a context.

contexts can be initialized with a transmit context descriptor and then used for a series of related transmit data descriptors. the context, for example, defines the checksum and offload capabilities for a given type of tcp/ip flow. all packets of this type can be sent using this context.

software is responsible for ensuring that a context is only overwritten when it is no longer needed. hardware does not include any logic to manage the on-chip contexts; it is completely up to software to populate and then use the on-chip context table.

each context defines information about the packet sent including the total size of the mac header (tdesc.machdr), the amount of payload data that should be included in each packet (tdesc.mss), tcp header length (tdesc.tcphdr), ip header length (tdesc.iphdr), and information about what type of protocol (tcp, ip, etc.) is used. other than tcp, ip (tdesc.tucmd), most information is specific to the segmentation capability.
because there are dedicated on-chip resources for contexts, they remain constant until they are modified by another context descriptor. this means that a context can be used for multiple packets (or multiple segmentation blocks) unless a new context is loaded prior to each new packet. depending on the environment, it might be unnecessary to load a new context for each packet. for example, if most traffic generated from a given node is standard tcp frames, this context could be set up once and used for many frames. only when some other frame type is required would a new context need to be loaded by software. this new context could use a different index or the same index.

this same logic can also be applied to the tcp/udp segmentation scenario, though the environment is a more restrictive one. in this scenario, the host is commonly asked to send messages of the same type, tcp/ip for instance, and these messages also have the same maximum segment size (mss). in this instance, the same context could be used for multiple tcp messages that require hardware segmentation.

7.2.2 transmit descriptors

the 82576 supports legacy descriptors and the 82576 advanced descriptors. legacy descriptors are intended to support legacy drivers to enable fast platform power up and to facilitate debug.

note: legacy descriptors must not be used when advanced features such as virtualization or macsec are used.
if legacy descriptors are used when ctrl_ext.rt, dtxswc.loopback enable, status.vfe, one of the dtxswc.macas bits, or one of the dtxswc.vlanas bits is set, packets are ignored and not sent.

the legacy descriptors are recognized as such based on the dext bit as discussed later in this section.

in addition, the 82576 supports two types of advanced transmit descriptors:

1. advanced transmit context descriptor, dtyp = 0010b.
2. advanced transmit data descriptor, dtyp = 0011b.

note: dtyp values 0000b and 0001b are reserved.

the transmit data descriptor (both legacy and advanced) points to a block of packet data to be transmitted. the advanced transmit context descriptor does not point to packet data. it contains control/context information that is loaded into on-chip registers that affect the processing of packets for transmission. the following sections describe the descriptor formats.

7.2.2.1 legacy transmit descriptor format

legacy descriptors are identified by having bit 29 of the descriptor (tdesc.dext) set to 0b. in this case, the descriptor format is defined as shown in table 7-23. note that the address and length must be supplied by software. also note that bits in the command byte are optional, as are the cso and css fields.

note: for frames that span multiple descriptors, the vlan, css, cso, cmd.vle, cmd.ic, and cmd.ifcs fields are valid only in the first descriptor and are ignored in the subsequent ones.

7.2.2.1.1 address (64)

physical address of a data buffer in host memory that contains a portion of a transmit packet.

7.2.2.1.2 length

length (tdesc.length) specifies the length in bytes to be fetched from the buffer address provided; the maximum length associated with any single legacy descriptor is 9728 bytes.
note: the maximum allowable packet size for transmits changes based on the value written to the tx packet buffer allocation (txpbs) register.

table 7-23. transmit descriptor (tdesc) fetch layout - legacy mode

offset   63:48   47:40   39:36    35:32   31:24   23:16   15:0
0        buffer address [63:0]
8        vlan    css     extcmd   sta     cmd     cso     length

table 7-24. transmit descriptor (tdesc) write-back layout - legacy mode

offset   63:48   47:40   39:36      35:32   31:24   23:16   15:0
0        reserved
8        vlan    css     reserved   sta     cmd     cso     length
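the fetch layout in table 7-23 can be expressed as two 64-bit words. the following is a minimal sketch of packing a legacy descriptor, using the bit positions from table 7-23 and the command bits from table 7-25. the structure and function names are hypothetical, and the sketch assumes a little-endian host so that the in-memory layout matches the table.

```c
#include <assert.h>
#include <stdint.h>

/* tdesc.cmd bits, per table 7-25 */
#define TX_CMD_EOP   0x01u
#define TX_CMD_IFCS  0x02u
#define TX_CMD_IC    0x04u
#define TX_CMD_RS    0x08u
#define TX_CMD_DEXT  0x20u  /* must be 0 for a legacy descriptor */
#define TX_CMD_VLE   0x40u

#define TX_LEGACY_MAX_LEN 9728u  /* maximum length of one legacy descriptor */

/* Hypothetical in-memory view of one 16-byte legacy descriptor. */
struct legacy_tx_desc {
    uint64_t buffer_addr;  /* offset 0: buffer address [63:0] */
    uint64_t fields;       /* offset 8: vlan/css/extcmd/sta/cmd/cso/length */
};

struct legacy_tx_desc legacy_tx_desc_pack(uint64_t phys_addr, uint16_t length,
                                          uint8_t cso, uint8_t cmd,
                                          uint8_t css, uint16_t vlan)
{
    struct legacy_tx_desc d;

    assert(length <= TX_LEGACY_MAX_LEN);
    assert((cmd & TX_CMD_DEXT) == 0u);  /* bit 29 of the descriptor stays 0 */

    d.buffer_addr = phys_addr;
    d.fields = (uint64_t)length           /* bits 15:0  */
             | ((uint64_t)cso  << 16)     /* bits 23:16 */
             | ((uint64_t)cmd  << 24)     /* bits 31:24 */
             | ((uint64_t)css  << 40)     /* bits 47:40 */
             | ((uint64_t)vlan << 48);    /* bits 63:48 */
    /* sta (35:32) and extcmd (39:36) are left at zero in this sketch */
    return d;
}
```

a single-buffer 64-byte frame with status reporting would use `legacy_tx_desc_pack(addr, 64, 0, TX_CMD_EOP | TX_CMD_IFCS | TX_CMD_RS, 0, 0)`; software can later test bit 32 of the second word (dd) after write-back.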
descriptor length(s) might be limited by the size of the transmit fifo. all buffers comprising a single packet must be able to be stored simultaneously in the transmit fifo. for any individual packet, the sum of the individual descriptors' lengths must be below 9728 bytes.

note: descriptors with zero length (null descriptors) transfer no data. null descriptors can only appear between packets and must have their eop bits set. if the tctl.psp bit is set, the total length of the packet transmitted, not including fcs, should be at least 17 bytes.

7.2.2.1.3 checksum offset and start - cso and css

a checksum offset (tdesc.cso) field indicates where, relative to the start of the packet, to insert a tcp checksum if this mode is enabled. a checksum start (tdesc.css) field indicates where to begin computing the checksum. both cso and css are in units of bytes and must be in the range of data provided to the 82576 in the descriptor. this means for short packets that are not padded by software, css and cso must be in the range of the unpadded data length, not the eventual padded length (64 bytes).

cso must be larger than css, css must be equal to or greater than 14 bytes, and cso must be smaller than the packet length minus four bytes. checksum calculation is not done if cso or css is out of range. this occurs if (css > length) or (cso > length - 1).

in the case of an 802.1q header, the offset values depend on the vlan insertion enable (vle) bit. if it is not set (vlan tagging included in the packet buffers), the offset values should include the vlan tagging. if it is set (vlan tagging is taken from the packet descriptor), the offset values should exclude the vlan tagging.
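the range rules for css and cso above lend themselves to a small driver-side check before a descriptor is queued. the following is a sketch that applies the three stated rules literally. the function name is hypothetical, and pkt_len is assumed to be the unpadded length of the data handed to the 82576; the text does not spell out whether "packet length" includes the fcs, so a real driver should confirm the exact boundary against hardware behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Applies the rules as stated in the text:
 *   - css must be equal to or greater than 14 bytes
 *   - cso must be larger than css
 *   - cso must be smaller than the packet length minus four bytes
 * pkt_len: assumed to be the unpadded data length given to the device. */
bool legacy_csum_offsets_ok(uint8_t css, uint8_t cso, uint16_t pkt_len)
{
    if (pkt_len < 4u)
        return false;  /* "length minus four" would be negative */

    return css >= 14u && cso > css && (unsigned)cso < (unsigned)pkt_len - 4u;
}
```

for an untagged ipv4/tcp frame with a 20-byte ip header, css is 34 (start of the tcp header) and cso is 50 (the tcp checksum field), which passes for a full-size 1514-byte frame.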
note: software must compute an offsetting entry to back out the bytes of the header that are not part of the ip pseudo header and should not be included in the tcp checksum, and store it in the position where the hardware-computed checksum is to be inserted.

hardware does not add the 802.1q ethertype or the vlan field following the 802.1q ethertype to the checksum. so for vlan packets, software can compute the values to back out only on the encapsulated packet rather than on the added fields.

udp checksum calculation is not supported by the legacy descriptor because, when using legacy descriptors, the 82576 is not aware of the l4 type of the packet and thus does not support the translation of a checksum result of 0x0000 to 0xffff needed to differentiate between a udp packet with a checksum of zero and a udp packet without a checksum.

because the cso field is eight bits wide, it puts a limit on the location of the checksum to 255 bytes from the beginning of the packet. hardware adds the checksum to the field at the offset indicated by the cso field. checksum calculations are for the entire packet starting at the byte indicated by the css field. a value of zero corresponds to the first byte in the packet. css must be set in the first descriptor for a packet.

7.2.2.1.4 command byte - cmd

the cmd byte stores the applicable command and has the fields shown in table 7-25.

table 7-25. transmit command (tdesc.cmd) layout

bit     7     6     5      4     3    2    1      0
field   rsv   vle   dext   rsv   rs   ic   ifcs   eop
- rsv (bit 7): reserved
- vle (bit 6): vlan packet enable
- dext (bit 5): descriptor extension (0 for legacy mode)
- reserved (bit 4): reserved
- rs (bit 3): report status
- ic (bit 2): insert checksum
- ifcs (bit 1): insert fcs
- eop (bit 0): end of packet

vle: indicates that the packet is a vlan packet; that is, hardware should add the vlan ethertype and an 802.1q vlan tag to the packet.

rs: signals the hardware to report the status information. this is used by software that does in-memory checks of the transmit descriptors to determine which ones are done. for example, if software queues up 10 packets to transmit, it can set the rs bit in the last descriptor of the last packet. if software maintains a list of descriptors with the rs bit set, it can look at them to determine if all packets up to (and including) the one with the rs bit set have been buffered in the output fifo. this is done by looking at the status byte and checking the descriptor done (dd) bit. if dd is set, the descriptor has been processed. refer to table 7-27 for the layout of the status field.

ic: if set, requests hardware to add the checksum of the data from css to the end of the packet at the offset indicated by the cso field.

ifcs: when set, hardware appends the mac fcs at the end of the packet. when cleared, software should calculate the fcs for proper crc check. there are several cases in which software must set ifcs:

- transmitting a short packet while padding is enabled by the tctl.psp bit.
- checksum offload is enabled by the ic bit in the tdesc.cmd.
- vlan header insertion enabled by the vle bit in the tdesc.cmd or by the vmvir registers.
- macsec offload is requested.

eop: when set, indicates the last descriptor making up the packet. note that one or many descriptors can be used to form a packet.
note: as opposed to the 82571eb, vle, ifcs, cso, and ic must be set correctly in the first descriptor of each packet. in previous silicon generations, some of these bits were required to be set in the last descriptor of a packet.

table 7-26. vlan tag insertion decision table

vle   action
0b    send generic ethernet packet.
1b    send 802.1q packet; the ethernet type field comes from the vet register and the vlan data comes from the vlan field of the tx descriptor.

note: this table is relevant only if vmvir.vlana = 00b (use descriptor command) for the queue.

7.2.2.1.5 status - sta
one bit provides transmit status, when rs is set in the command: dd indicates that the descriptor is done and is written back after the descriptor has been processed.

note: when head write-back is enabled, the write-back of the dd bit to the descriptor is not executed.

7.2.2.1.6 dd (bit 0) - descriptor done status

table 7-27. transmit status (tdesc.sta) layout

bit     3:1        0
field   reserved   dd

7.2.2.1.7 vlan

the vlan field is used to provide the 802.1q/802.1ac tagging information. the vlan field is qualified only on the first descriptor of each packet when the vle bit is set. the rule for the vlan tag is to use network ordering (also called big endian). it appears in the following manner in the descriptor:

table 7-28. vlan field (tdesc.vlan) layout

bit     15:13   12    11:0
field   pri     cfi   vlan id

- vlan id: the 12-bit tag indicating the vlan group of the packet.
- canonical form indication (cfi): set to zero for ethernet packets.
- pri: indicates the priority of the packet.

note: the vlan tag should be sent in network order.

7.2.2.2 advanced transmit context descriptor

table 7-29. transmit context descriptor (tdesc) layout - (type = 0010b)

offset 0:
  bits    field
  63:40   reserved
  39:32   ipsec sa index
  31:16   vlan
  15:9    maclen
  8:0     iplen

offset 8:
  bits    field
  63:48   mss
  47:40   l4len
  39      rsv
  38:36   idx
  35:30   reserved
  29      dext
  28:24   rsv
  23:20   dtyp
  19:9    tucmd
  8:0     ipsec esp_len

7.2.2.2.1 iplen (9)

ip header length. if an offload is requested, iplen must be greater than or equal to 20 and less than or equal to 511. for ipsec flows, it includes the length of the ipsec header.

7.2.2.2.2 maclen (7)
this field indicates the length of the mac header. when an offload is requested (one of tse, ixsm, or txsm is set), maclen must be larger than or equal to 14 and less than or equal to 127. this field should include only the part of the l2 header supplied by the software device driver and not the parts added by hardware. the following table lists the value of maclen in the different cases.

table 7-30. maclen values

snap   regular vlan        extended vlan   maclen
no     by hardware or no   no              14
no     by hardware or no   yes             18
no     by software         no              18
no     by software         yes             22
yes    by hardware or no   no              22
yes    by hardware or no   yes             26
yes    by software         no              26
yes    by software         yes             30

vlan (16)

802.1q vlan tag to be inserted in the packet during transmission. this vlan tag is inserted and needed only when a packet using this context has its dcmd.vle bit set. this field should include the entire 16-bit vlan field including the cfi and priority fields as shown in table 7-28.

note: the vlan tag should be sent in network order.

7.2.2.2.3 ipsec sa idx (8)

ipsec sa index. if an ipsec offload is requested for the packet (ipsec bit is set in the advanced tx data descriptor), this field indicates the index in the sa table where the ipsec key and salt are stored for that flow.

7.2.2.2.4 reserved (24)

7.2.2.2.5 ips_esp_len (9)

size of the esp trailer and esp icv appended by software. meaningful only if the ipsec_type bit is set in the tucmd field, and only for single send packets for which the ipsec bit is set in their advanced tx data descriptor.

7.2.2.2.6 tucmd (11)

- rsv (bits 10:6): reserved
- encryption (bit 5): esp encryption offload is required. meaningful only to packets for which the ipsec bit is set in their advanced tx data descriptor.
- ipsec_type (bit 4): set for esp. cleared for ah. meaningful only to packets for which the ipsec bit is set in their advanced tx data descriptor.
- l4t (bits 3:2): l4 packet type (00b: udp; 01b: tcp; 10b: sctp; 11b: rsv)
- ipv4 (bit 1): ip packet type: when 1b, ipv4; when 0b, ipv6
- snap (bit 0): snap indication
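the maclen values in table 7-30 follow an additive pattern: 14 bytes of base ethernet header, plus 8 for snap, plus 4 for a vlan tag supplied by software, plus 4 for an extended vlan. the sketch below encodes that observation and also packs the first 64-bit word of the context descriptor using the bit positions from table 7-29. both function names are hypothetical, and the additive rule is inferred from the table rather than stated by the datasheet.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* maclen per table 7-30. Only header bytes supplied by software count;
 * a vlan tag inserted by hardware is not included. */
unsigned tx_maclen(bool snap, bool vlan_by_software, bool extended_vlan)
{
    return 14u
         + (snap ? 8u : 0u)
         + (vlan_by_software ? 4u : 0u)
         + (extended_vlan ? 4u : 0u);
}

/* First 64-bit word (offset 0) of the advanced context descriptor,
 * per table 7-29. Bits 63:40 are reserved and left at zero. */
uint64_t tx_ctx_qword0(uint16_t iplen, uint8_t maclen, uint16_t vlan,
                       uint8_t ipsec_sa_idx)
{
    assert(iplen <= 511u);   /* 9-bit field  */
    assert(maclen <= 127u);  /* 7-bit field  */

    return (uint64_t)iplen                   /* bits 8:0   */
         | ((uint64_t)maclen << 9)           /* bits 15:9  */
         | ((uint64_t)vlan << 16)            /* bits 31:16 */
         | ((uint64_t)ipsec_sa_idx << 32);   /* bits 39:32 */
}
```

for a plain ipv4 frame without snap or software vlan, `tx_ctx_qword0(20, (uint8_t)tx_maclen(false, false, false), 0, 0)` yields 0x1c14.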
7.2.2.2.7 dtyp (4)

always 0010b for this type of descriptor.

7.2.2.2.8 rsv (5)

reserved.

7.2.2.2.9 dext

descriptor extension (1b for advanced mode).

7.2.2.2.10 rsv (6)

reserved.

7.2.2.2.11 idx (3)

index into the hardware context table where this context is stored.

7.2.2.2.12 rsv (1)

7.2.2.2.13 l4len (8)

layer 4 header length. if tse is set in the data descriptor pointing to this context, this field must be greater than or equal to 12 and less than or equal to 255. otherwise, this field is ignored.

7.2.2.2.14 mss (16)

controls the maximum segment size (mss). this specifies the maximum tcp payload segment sent per frame, not including any header or trailer. the total length of each frame (or section) sent by the tcp/udp segmentation mechanism (excluding ethernet crc) is as follows:

total length = maclen
             + 4 (if vle is set)
             + 4 or 8 (if cmtgi is set, or if rlttgi is also set, assuming bcntlen is clear)
             + iplen + l4len + mss
             + [padlen + 18] (if esp packet)

the one exception is the last packet of a tcp/udp segmentation, which is typically shorter.

mss is ignored when dcmd.tse is not set.

padlen ranges from 0 to 3 in tx. it is the content of the esp padding length field that is computed when offloading esp in cipher blocks of 16 bytes (aes-128) with respect to the following alignment formula:

[l4len + mss + padlen + 2] modulo(4) = 0

for single send packets: ips_esp_len = padlen + 18.

note: the header lengths must meet the following:
maclen + iplen + l4len <= 512

the context descriptor requires valid data only in the fields used by the specific offload options. the following table lists the required valid fields according to the different offload options.

table 7-31. valid field in context vs. required offload

required offload         valid fields in context
tse  txsm  ixsm  ipsec   vlan  l4len  iplen  maclen  mss  l4t  ipv4  ipsec sa index  ipsec esp_len
1b   1b    x     0b      vle   yes    yes    yes     yes  yes  yes   no              no
1b   1b    x     1b      vle   yes    yes    yes     yes  yes  yes   yes             ipsec_type
0b   1b    x     0b      vle   no     yes    yes     no   yes  yes   no              no
0b   1b    x     1b      vle   no     yes    yes     no   yes  yes   yes             ipsec_type
0b   0b    1b    0b      vle   no     yes    yes     no   no   yes   no              no
0b   0b    1b    1b      vle   no     yes    yes     no   no   yes   yes             ipsec_type
0b   0b    0b    0b      no context required unless vle is set.
0b   0b    0b    1b      vle   no     yes    yes     no   no   yes   yes             ipsec_type

7.2.2.3 advanced transmit data descriptor

table 7-32. advanced tx descriptor read format

offset 0:
  bits    field
  63:0    address[63:0]

offset 8:
  bits    field
  63:46   paylen
  45:40   popts
  39      cc
  38:36   idx
  35:32   sta
  31:24   dcmd
  23:20   dtyp
  19:18   mac
  17:16   rsv
  15:0    dtalen

table 7-33. advanced tx descriptor write-back format

offset 0:
  bits    field
  63:0    rsv

offset 8:
  bits    field
  63:36   rsv
  35:32   sta
  31:0    rsv

note: for frames that span multiple descriptors, all fields apart from dcmd.eop, dcmd.rs, dcmd.dext, dtalen, address and dtyp are valid only in the first descriptor and are ignored in the subsequent ones.
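the read format in table 7-32 can be packed into the second 64-bit word of the descriptor as shown in the sketch below. bit positions come from table 7-32; the dcmd and popts bit values come from the field descriptions in sections 7.2.2.3.6 and 7.2.2.3.10. macro and function names are hypothetical, and a little-endian host is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define ADV_DTYP_DATA   0x3u   /* dtyp = 0011b */

/* dcmd bits */
#define ADV_DCMD_EOP    0x01u
#define ADV_DCMD_IFCS   0x02u
#define ADV_DCMD_RS     0x08u
#define ADV_DCMD_DEXT   0x20u  /* 1b for advanced mode */
#define ADV_DCMD_VLE    0x40u
#define ADV_DCMD_TSE    0x80u

/* popts bits */
#define ADV_POPTS_IXSM  0x01u
#define ADV_POPTS_TXSM  0x02u
#define ADV_POPTS_IPSEC 0x04u

/* Second 64-bit word (offset 8) of an advanced tx data descriptor.
 * sta (35:32), cc (39) and rsv (17:16) are left at zero. */
uint64_t adv_tx_data_qword1(uint16_t dtalen, uint8_t mac, uint8_t dcmd,
                            uint8_t idx, uint8_t popts, uint32_t paylen)
{
    assert(mac <= 0x3u);         /* 2-bit field  */
    assert(idx <= 0x7u);         /* 3-bit field  */
    assert(popts <= 0x3Fu);      /* 6-bit field  */
    assert(paylen <= 0x3FFFFu);  /* 18-bit field */

    dcmd = (uint8_t)(dcmd | ADV_DCMD_DEXT);  /* always advanced format */

    return (uint64_t)dtalen                      /* bits 15:0  */
         | ((uint64_t)mac << 18)                 /* bits 19:18 */
         | ((uint64_t)ADV_DTYP_DATA << 20)       /* bits 23:20 */
         | ((uint64_t)dcmd << 24)                /* bits 31:24 */
         | ((uint64_t)idx << 36)                 /* bits 38:36 */
         | ((uint64_t)popts << 40)               /* bits 45:40 */
         | ((uint64_t)paylen << 46);             /* bits 63:46 */
}
```

forcing dext inside the helper keeps bit 29 of the descriptor set, which is how hardware distinguishes advanced descriptors from legacy ones.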
7.2.2.3.1 address (64)

physical address of a data buffer in host memory that contains a portion of a transmit packet.

7.2.2.3.2 dtalen (16)

length in bytes of the data buffer at the address pointed to by this specific descriptor.

note: if the tctl.psp bit is set, the total length of the packet transmitted, not including fcs, should be at least 17 bytes.

7.2.2.3.3 rsv (2)

reserved.

7.2.2.3.4 mac (2)

- ilsec (bit 0): apply macsec on packet
- 1588 (bit 1): ieee1588 timestamp packet

when ilsec is set, hardware includes the macsec header (sectag) and macsec header digest (signature). the macsec processing is defined by the enable tx macsec field in the lsectxctrl register. the ilsec bit in the packet descriptor should not be set if macsec processing is not enabled by the enable tx macsec field. if the ilsec bit is set erroneously while the enable tx macsec field is set to 00b, then the packet is dropped.

7.2.2.3.5 dtyp (4)

0011b is the value for this descriptor type.

7.2.2.3.6 dcmd (8)

- tse (bit 7): tcp/udp segmentation enable
- vle (bit 6): vlan packet enable
- dext (bit 5): descriptor extension (1b for advanced mode)
- reserved (bit 4)
- rs (bit 3): report status
- reserved (bit 2)
- ifcs (bit 1): insert fcs
- eop (bit 0): end of packet

tse indicates a tcp/udp segmentation request. when tse is set in the first descriptor of a tcp packet, hardware must use the corresponding context descriptor in order to perform tcp segmentation. the type of segmentation applied is defined according to the tucmd.l4t field in the context descriptor.

note: it is recommended that tctl.psp be enabled when tse is used since the last frame can be shorter than 60 bytes, resulting in a bad frame if psp is disabled.

vle indicates that the packet is a vlan packet and hardware must add the vlan ethertype and an 802.1q vlan tag to the packet.
dext must be 1b to indicate advanced descriptor format (as opposed to legacy).

rs signals hardware to report the status information. this is used by software that does in-memory checks of the transmit descriptors to determine which ones are done. for example, if software queues up 10 packets to transmit, it can set the rs bit in the last descriptor of the last packet. if software maintains a list of descriptors with the rs bit set, it can look at them to determine if all packets up to (and including) the one with the rs bit set have been buffered in the output fifo. this is done by looking at the status byte and checking the dd bit. if dd is set, the descriptor has been processed. refer to the sections that follow for the layout of the status field.

note: descriptors with zero length transfer no data.

ifcs: when set, hardware appends the mac fcs at the end of the packet. when cleared, software should calculate the fcs for proper crc check. there are several cases in which software must set ifcs:

- transmitting a short packet while padding is enabled by the tctl.psp bit.
- checksum offload is enabled by either the txsm or ixsm bits in the tdesc.dcmd.
- vlan header insertion enabled by the vle bit in the tdesc.dcmd.
- tcp/udp segmentation offload enabled by the tse bit in the tdesc.dcmd.

eop indicates whether this is the last buffer for a packet.

7.2.2.3.7 sta (4)

- rsv (bits 3:1): reserved
- dd (bit 0): descriptor done

7.2.2.3.8 idx (3)

index into the hardware context table to indicate which context should be used for this request. if no offload is required, this field is not relevant and no context needs to be initiated before the packet is sent. see table 7-31 for details on which packets require a context reference.

7.2.2.3.9 rsv (1)

reserved. set to 0.

7.2.2.3.10 popts (6)

- rsv (bits 5:3): reserved
- ipsec (bit 2): ipsec offload request
- txsm (bit 1): insert l4 checksum
- ixsm (bit 0): insert ip checksum

txsm, when set, indicates that the l4 checksum should be inserted. in this case, tucmd.l4t indicates whether the checksum is tcp, udp, or sctp. when dcmd.tse is set, txsm must be set to 1b. if this bit is set, the packet should at least contain a tcp header.
ixsm, when set, indicates that the ip checksum should be inserted. for ipv6 packets, this bit must be cleared. if the dcmd.tse bit is set and tucmd.ipv4 is set, ixsm must be set as well. if this bit is set, the packet should at least contain an ip header.

7.2.2.3.11 paylen (18)

paylen indicates the size (in byte units) of the data buffer(s) in host memory for transmission. in a single send packet, paylen defines the entire packet size fetched from host memory. it does not include the fields that hardware adds such as optional vlan tagging, ethernet crc or ethernet padding. when macsec offload is enabled, it does not include the macsec encapsulation. when ipsec offload is enabled, it does not include the esp trailer added by hardware. in a large send case (regardless of whether it is transmitted in a single packet or multiple packets), paylen defines the protocol payload size fetched from host memory. in tcp or udp segmentation offload, paylen defines the tcp/udp payload size.

note: when a packet spreads over multiple descriptors, all the descriptor fields are only valid in the first descriptor of the packet, except for rs, which is always checked, dtalen, which reflects the size of the buffer in the current descriptor, and eop, which is always set in the last descriptor of the series.

7.2.2.4 transmit descriptor ring structure

the transmit descriptor ring structure is shown in figure 7-12. a pair of hardware registers maintains each transmit descriptor ring in the host memory. new descriptors are added to the queue by software by writing descriptors into the circular buffer memory region and moving the tail pointer associated with that queue. the tail pointer points to one entry beyond the last hardware owned descriptor. transmission continues up to the descriptor where head equals tail at which point the queue is empty.
descriptors passed to hardware should not be manipulated by software until the head pointer has advanced past them.
figure 7-12. transmit descriptor ring structure

the shaded boxes in the figure represent descriptors that are not currently owned by hardware and that software can modify.

the transmit descriptor ring is described by the following registers:

- transmit descriptor base address register (tdba 0-15): this register indicates the start address of the descriptor ring buffer in the host memory; this 64-bit address is aligned on a 16-byte boundary and is stored in two consecutive 32-bit registers. hardware ignores the lower four bits.
- transmit descriptor length register (tdlen 0-15): this register determines the number of bytes allocated to the circular buffer. this value must be zero modulo 128.
- transmit descriptor head register (tdh 0-15): this register holds a value that is an offset from the base and indicates the in-progress descriptor. there can be up to 64k descriptors in the circular buffer. reading this register returns the value of head corresponding to descriptors already loaded in the output fifo. this register reflects the internal head of the hardware write-back process including the descriptor in the posted write pipe and might point further ahead than the last descriptor actually written back to the memory.
- transmit descriptor tail register (tdt 0-15): this register holds a value, which is an offset from the base, and indicates the location beyond the last descriptor hardware can process. this is the location where software writes the first new descriptor.

the driver should not hand to the 82576 descriptors that describe a partial packet. consequently, the number of descriptors used to describe a packet cannot be larger than the ring size.
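the register constraints above translate into a few lines of ring arithmetic. the sketch below checks the base alignment and tdlen granularity, and computes descriptor addresses and head/tail distances, assuming 16-byte descriptors (the size stated for descriptor write-backs) and head/tail values counted in descriptors. all function names are hypothetical; actual register access is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TX_DESC_SIZE 16u  /* bytes per transmit descriptor */

/* tdba must be 16-byte aligned; tdlen must be zero modulo 128. */
bool tx_ring_params_ok(uint64_t base, uint32_t tdlen_bytes)
{
    return (base % 16u) == 0u && (tdlen_bytes % 128u) == 0u;
}

/* Number of descriptors described by a tdlen value. */
uint32_t tx_ring_count(uint32_t tdlen_bytes)
{
    return tdlen_bytes / TX_DESC_SIZE;
}

/* Byte address of the descriptor that a head or tail value refers to. */
uint64_t tx_desc_addr(uint64_t base, uint32_t ptr)
{
    return base + (uint64_t)ptr * TX_DESC_SIZE;
}

/* Next head/tail value, wrapping at the end of the circular buffer. */
uint32_t tx_ring_next(uint32_t ptr, uint32_t count)
{
    assert(count != 0u);
    return (ptr + 1u) % count;
}

/* Descriptors currently owned by hardware: from head up to, but not
 * including, tail. Zero means head == tail, i.e. the queue is empty. */
uint32_t tx_ring_pending(uint32_t head, uint32_t tail, uint32_t count)
{
    assert(count != 0u && head < count && tail < count);
    return (tail + count - head) % count;
}
```

a driver would typically write all descriptors of a packet, then publish them with a single tail update to `tx_ring_next` of the last slot used, so that hardware never sees a partial packet.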
the base register indicates the start of the circular descriptor queue and the length register indicates the maximum size of the descriptor ring. the lower seven bits of length are hard wired to 0b. byte addresses within the descriptor buffer are computed as follows:

address = base + (ptr * 16), where ptr is the value in the hardware head or tail register.

the size chosen for the head and tail registers permits a maximum of 65528 (64k minus 8) descriptors, or approximately 16k packets for the transmit queue given an average of four descriptors per packet.

once activated, hardware fetches the descriptor indicated by the hardware head register. the hardware tail register points one beyond the last valid descriptor.

software can detect which packets have already been processed by hardware as follows:

- read the head register to determine which packets (those logically before the head) have been transferred to the on-chip fifo or transmitted. note that this method is not recommended as races between the internal update of the head register and the actual write-back of descriptors might occur.
- read the value of the head as stored at the address pointed to by the tdwbah/tdwbal pair.
- track the dd bits in the descriptor ring.

all the registers controlling the descriptor ring's behavior should be set before transmit is enabled, apart from the tail registers, which are used during the regular flow of data.

note: software can determine if a packet has been sent by any of three methods: setting the rs bit in the transmit descriptor command field, performing a pio read of the transmit head register, or reading the head value written by the 82576 to the address pointed to by the tdwbal and tdwbah registers (see section 7.2.3 for details). checking the transmit descriptor dd bit or head value in memory eliminates a potential race condition.
all descriptor data is written to the i/o bus prior to incrementing the head register, but a read of the head register could pass the data write in systems performing i/o write buffering. updates to transmit descriptors use the same i/o write path and follow all data writes. consequently, they are not subject to the race. in general, hardware prefetches packet data prior to transmission. hardware typically updates the value of the head pointer after storing data in the transmit fifo.

7.2.2.5 transmit descriptor fetching

the descriptor processing strategy for transmit descriptors is essentially the same as for receive descriptors except that a different set of thresholds is used. as for receives, the number of on-chip transmit descriptors has been increased (from 8 to 64) and the fetch and write-back algorithms modified.

when an on-chip descriptor buffer is empty, a fetch happens as soon as any descriptors are made available (host writes to the tail pointer). if several on-chip descriptor queues are in this situation at the same time, the highest indexed queue must be served first and so forth, down to the lowest indexed queue.

a queue is considered empty for the transmit descriptor fetch algorithm as long as:

- there is still not at least one complete packet (single or large send) in its corresponding internal queue.
- there is no descriptor already on its way from system memory to the internal cache.
- the corresponding internal descriptor cache is not full.
Each time a descriptor fetch request is sent for an empty queue, the maximum available number of descriptors is requested, regardless of cache alignment issues.

When the on-chip buffer is nearly empty (TXDCTL[n].PTHRESH), a prefetch is performed each time enough valid descriptors (TXDCTL[n].HTHRESH) are available in host memory and no other DMA activity of greater priority is pending (descriptor fetches and write-backs or packet data transfers). If several on-chip descriptor queues are in this situation at the same time, hardware starts from the most starved queue and, among equally starved queues, from the highest indexed queue, as before.

Note: The starvation level of a queue corresponds to the number of descriptors above the prefetch threshold that are already in the internal queue. A queue is more starved if there are fewer descriptors in the internal queue. Starvation levels might be compared roughly, not at single-descriptor resolution.

When the number of descriptors in host memory is greater than the available on-chip descriptor storage, the 82576 might elect to perform a fetch that is not a multiple of the cache-line size. Hardware performs this non-aligned fetch if doing so results in the next descriptor fetch being aligned on a cache-line boundary. This makes the descriptor fetch mechanism more efficient in cases where it has fallen behind software.

Note: The 82576 never fetches descriptors beyond the descriptor tail pointer.

7.2.2.6 Transmit Descriptor Write-Back

The descriptor write-back policy for transmit descriptors is similar to that for receive descriptors when the TXDCTL[n].WTHRESH value is not 0b. In this case, all descriptors are written back regardless of the value of their RS bit.
When the TXDCTL[n].WTHRESH value is 0b, since transmit descriptor write-backs do not happen for every descriptor (controlled by RS in the transmit descriptor), only descriptors that have the RS bit set are written back.

Any descriptor write-back includes the full 16 bytes of the descriptor.

Since the benefit of delaying and then bursting transmit descriptor write-backs is small at best, it is likely that the threshold is left at the default value (0b) to force immediate write-back of transmit descriptors and to preserve backward compatibility.

Descriptors are written back in one of three cases:
• TXDCTL[n].WTHRESH = 0b and a descriptor that has RS set is ready to be written back
• The corresponding EITR counter has reached zero
• TXDCTL[n].WTHRESH > 0b and TXDCTL[n].WTHRESH descriptors have accumulated

For the first condition, write-backs are immediate. This is the default operation and is backward compatible with previous device implementations.

The other two conditions are only valid if descriptor bursting is enabled (section 8.12.13). In the second condition, the EITR counter is used to force timely write-back of descriptors. The first packet after timer initialization starts the timer. Timer expiration flushes any accumulated descriptors and sets an interrupt event (TXDW).

For the final condition, if TXDCTL[n].WTHRESH descriptors are ready for write-back, the write-back is performed.
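The three write-back cases can be condensed into a small decision function. The following Python sketch is an illustrative model only; the function name and its parameters are hypothetical, and it assumes that "descriptor bursting enabled" is equivalent to WTHRESH > 0.

```python
def should_write_back(wthresh: int, rs_ready: bool,
                      pending: int, eitr_expired: bool) -> bool:
    """Model of the three transmit descriptor write-back cases.

    wthresh      -- value of TXDCTL[n].WTHRESH
    rs_ready     -- a descriptor with RS set is ready to be written back
    pending      -- number of descriptors accumulated for write-back
    eitr_expired -- the corresponding EITR counter has reached zero
    """
    if wthresh == 0:
        # Case 1: immediate write-back, driven only by the RS bit.
        return rs_ready
    # Assumption: WTHRESH > 0 stands for "bursting enabled".
    # Case 2: timer expiry flushes whatever has accumulated.
    # Case 3: WTHRESH descriptors have accumulated.
    return pending > 0 and (eitr_expired or pending >= wthresh)
```

For example, with WTHRESH = 4 and three descriptors pending, nothing is written back until either a fourth descriptor accumulates or the EITR counter expires.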
An additional mode exists in which transmit descriptors are not written back at all and the head pointer of the descriptor ring is written instead, as described in section 7.2.3.

7.2.3 Tx Completions Head Write-Back

In legacy hardware, transmit requests are completed by writing the DD bit to the transmit descriptor ring. This causes cache thrash, since both the software device driver and hardware are writing to the descriptor ring in host memory. Instead of writing the DD bits to signal that a transmit request completed, hardware can write the contents of the descriptor queue head to host memory. The software device driver reads that memory location to determine which transmit requests are complete. To improve the performance of this feature, the software device driver needs to program DCA registers to configure which CPU is processing each Tx queue.

7.2.3.1 Description

The head counter is reflected in a memory location that is allocated by software for each queue.

Head write-back occurs if TDWBAL#.HEAD_WB_EN is set for this queue and the RS bit is set in the Tx descriptor, following the corresponding data upload into the packet buffer. If the head write-back feature is enabled, the 82576 ignores WTHRESH and takes into account only descriptors with the RS bit set (as if WTHRESH were set to 0b). In addition, the head write-back occurs upon EITR expiration for queues where the WB_ON_EITR field in TDWBAL is set.

The software device driver controls this feature through the Tx queue 0-15 head write-back address registers, low and high (thus allowing a 64-bit address). See section 8.12.8 and section 8.12.9.

The low register's LSBs hold the control bits.
• The HEAD_WB_EN bit enables activation of head write-back. In this case, no descriptor write-back is executed.
• The 30 upper bits of this register hold the lowest 32 bits of the head write-back address, assuming that the two last bits are zero.

The high register holds the high part of the 64-bit address.

Note: Hardware writes a full dword when writing this value, so software should reserve enough space for each head value and make sure the write-back address programmed in TDWBAL is dword aligned.

If software enables head write-back, it must also disable PCI Express relaxed ordering on the write-back transactions. This is done by disabling bit 11 in the TXCTL register for each active transmit queue. See section 8.13.2.

The 82576 might update the head with values that are larger than the last head pointer that holds a descriptor with the RS bit set, but the value always points to a free descriptor (a descriptor that is no longer owned by the 82576).

7.2.4 TCP/UDP Segmentation

Hardware TCP segmentation is one of the offloading options supported by the Windows* and Linux* TCP/IP stacks. This is often referred to as TCP segmentation offloading or TSO. This feature enables the TCP/IP stack to pass to the network device driver a message to be transmitted that is bigger than the maximum transmission unit (MTU) of the medium. It is then the responsibility of the software device driver and hardware to divide the TCP message into MTU-size frames that have appropriate layer 2 (Ethernet), 3 (IP), and 4 (TCP) headers. These headers must include sequence number, checksum fields, options
and flag values as required. Note that some of these values (such as the checksum values) are unique for each packet of the TCP message, while other fields, such as the source IP address, are constant for all packets associated with the TCP message.

The 82576 also supports UDP segmentation for embedded applications, although this offload is not supported by the regular Windows* and Linux* stacks. Any reference in this section to TCP segmentation should be considered as referring to both TCP and UDP segmentation.

Padding (TCTL.PSP) must be enabled in TCP segmentation mode, since the last frame might be shorter than 60 bytes, resulting in a bad frame if PSP is disabled.

The offloading of these mechanisms to the software device driver and the 82576 saves significant CPU cycles. Note that the software device driver shares the additional tasks to support these options.

7.2.4.1 Assumptions

The following assumptions apply to the TCP segmentation implementation in the 82576:
• The RS bit operation is not changed.
• Interrupts are set after data in buffers pointed to by individual descriptors is transferred (DMA'd) to hardware.

7.2.4.2 Transmission Process

The transmission process for regular (non-TCP segmentation) packets involves:
• The protocol stack receives from an application a block of data that is to be transmitted.
• The protocol stack calculates the number of packets required to transmit this block based on the MTU size of the media and the required packet headers.

For each packet of the data block:
• Ethernet, IP, and TCP/UDP headers are prepared by the stack.
• The stack interfaces with the software device driver and commands it to send the individual packet.
• The software device driver gets the frame and interfaces with the hardware.
• The hardware reads the packet from host memory (via DMA transfers).
• The software device driver returns ownership of the packet to the network operating system (NOS) when hardware has completed the DMA transfer of the frame (indicated by an interrupt).

The transmission process for the 82576 TCP segmentation offload implementation involves:
• The protocol stack receives from an application a block of data that is to be transmitted.
• The stack interfaces to the software device driver and passes the block down with the appropriate header information.
• The software device driver sets up the interface to the hardware (via descriptors) for the TCP segmentation context.

Hardware DMAs (transfers) the packet data and performs the Ethernet packet segmentation and transmission based on offset and payload length parameters in the TCP/IP context descriptor, including:
• Packet encapsulation
• Header generation and field updates, including IPv4, IPv6, and TCP/UDP checksum generation
• The software device driver returns ownership of the block of data to the NOS when hardware has completed the DMA transfer of the entire data block (indicated by an interrupt).

7.2.4.2.1 TCP Segmentation Data Fetch Control

To perform TCP segmentation in the 82576, the DMA must be able to fit at least one packet of the segmented payload into available space in the on-chip packet buffer. The DMA does various comparisons between the remaining payload and the packet buffer available space, fetching additional payload and sending additional packets as space permits.

To enable interleaving between descriptor queues at Ethernet frame resolution inside TSO requests, the frame header pointed to by the so-called header descriptors is reread from system memory by hardware for every LSO segment, and only the header's descriptors (not the header's content) are stored in an internal cache.

To limit the internal cache dimensions, software is required to spread the header over at most four descriptors, while still being allowed to mix header and data in the last header buffer. This limitation applies to headers up to and including layer 4, for both IPv4 and IPv6.

7.2.4.2.2 TCP Segmentation Write-Back Modes

Because TCP segmentation mode uses the buffers that contain the packet header multiple times, there are some limitations on the combinations of write-back and buffer release methods that can be used, in order to guarantee that the header buffers remain available until the entire packet is processed. These limitations are described in the table below.

Table 7-34. Write-back options for large send

WTHRESH | RS | Head write-back enable | Hardware behavior | Software expected behavior for TSO packets
0 | Set in EOP descriptors only | Disable | Hardware writes back descriptors with the RS bit set, one at a time. | Software can retake ownership of all descriptors up to the last descriptor with the DD bit set.
0 | Set in any descriptors | Disable | Hardware writes back descriptors with the RS bit set, one at a time. | Software can retake ownership of entire packets (EOP bit set) up to the last descriptor with the DD bit set.
0 | Not set at all | Disable | Hardware does not write back any descriptor (since the RS bit is not set). | Software should poll the TDH register. The TDH register reflects the last descriptor that software can take ownership of. (1)
>0 | Don't care | Disable | Hardware writes back all the descriptors in bursts and sets all the DD bits. | Software can retake ownership of entire packets up to the last descriptor with both the DD and EOP bits set.
Don't care | Not set at all | Enable | Hardware writes back the head pointer only at the EITR expire event, reflecting the last descriptor that software can take ownership of. | Software may poll the TDH register or use the head value written back at the EITR expire event. The TDH register reflects the last descriptor that software can take ownership of.
Don't care | Set in EOP descriptors only | Enable | Hardware writes back the head pointer for each descriptor with the RS bit set. (2) | Software can retake ownership of all descriptors up to the descriptor pointed to by the head pointer read from system memory (by interrupt or polling).
Don't care | Set in any descriptors | Enable | Hardware writes back the head pointer for each descriptor with the RS bit set. | This mode is illegal: since software does not access the descriptor, it cannot tell when the pointer has passed the EOP descriptor.
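For the head write-back rows of the table, a driver reclaims descriptors by comparing the head value read from system memory with its own clean-up index. The Python sketch below shows the ring arithmetic only. The function and parameter names are hypothetical, and it assumes that the descriptors software may take back are those logically before the head (as stated for the head register in section 7.2.2.4), so the head index itself is excluded.

```python
from typing import List


def reclaimable(next_to_clean: int, head: int, ring_size: int) -> List[int]:
    """Ring indices software may take back, in order.

    next_to_clean -- first descriptor index software has not yet reclaimed
    head          -- head value read back from host memory
    ring_size     -- number of descriptors in the ring
    """
    # Modular distance handles the wrap from the end of the ring to index 0.
    count = (head - next_to_clean) % ring_size
    return [(next_to_clean + i) % ring_size for i in range(count)]
```

In this model, head equal to `next_to_clean` means nothing new has completed; software has to make sure it never posts a full ring's worth of descriptors between clean-ups, or that case becomes ambiguous.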
Notes for table 7-34:
1. Polling of the TDH register is a valid method only when the RS bit is never set; otherwise, race conditions between software and hardware accesses to the descriptor ring can occur.
2. At the EITR expire event, the hardware writes back the head pointer reflecting the last descriptor that software can take ownership of.

7.2.4.3 TCP Segmentation Performance

Performance improvements for a hardware implementation of TCP segmentation offload include:
• The stack does not need to partition the block to fit the MTU size, saving CPU cycles.
• The stack only computes one Ethernet, IP, and TCP header per segment, saving CPU cycles.
• The stack interfaces with the device driver only once per block transfer, instead of once per frame.
• Larger PCI bursts are used, which improves bus efficiency (for example, by lowering transaction overhead).
• Interrupts are easily reduced to one per TCP message instead of one per packet.
• Fewer I/O accesses are required to command the hardware.

7.2.4.4 Packet Format

A typical TCP/IP transmit window size is 8760 bytes (about 6 full-size frames). Today the average size on corporate intranets is 12-14 KB, and normally the maximum window size allowed is 64 KB (unless window scaling, RFC 1323, is specified). A TCP message can be as large as 256 KB and is generally fragmented across multiple pages in host memory. The 82576 partitions the data packet into standard Ethernet frames prior to transmission. The 82576 supports calculating the Ethernet, IP, TCP, and UDP headers, including checksum, on a frame-by-frame basis.

Frame formats supported by the 82576 include:
• Ethernet 802.3
• IEEE 802.1Q VLAN (Ethernet 802.3ac)
• Ethernet type 2
• Ethernet SNAP
• IPv4 headers with options
• IPv4 headers without options with one AH/ESP IPsec header
• IPv6 headers with extensions
• TCP with options
• UDP with options

Table 7-35. TCP/IP or UDP/IP packet format sent by host

L2/L3/L4 headers: Ethernet | IPv4/IPv6 | TCP/UDP
Data: Data (full TCP message)

Table 7-36. TCP/IP or UDP/IP packet format sent by 82576

L2/L3/L4 header (updated) | Data (first MSS) | FCS | ... | L2/L3/L4 header (updated) | Data (next MSS) | FCS | ...
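The relationship between the two formats can be sketched in a few lines of Python: one message payload is cut into MSS-sized pieces, and only the last piece may be shorter. The function name is hypothetical; `paylen` and `mss` stand for the PAYLEN and MSS context-descriptor parameters used later in this section.

```python
from typing import List


def segment_payloads(paylen: int, mss: int) -> List[int]:
    """Payload byte count carried by each frame of one segmented message."""
    if paylen <= 0 or mss <= 0:
        raise ValueError("paylen and mss must be positive")
    full_frames, remainder = divmod(paylen, mss)
    # Every frame carries MSS bytes except possibly the last one.
    return [mss] * full_frames + ([remainder] if remainder else [])
```

For instance, the 8760-byte window mentioned above with an MSS of 1460 bytes yields exactly six full frames, while a 4000-byte message yields two full frames followed by a 1080-byte frame.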
VLAN tag insertion might be handled by hardware.

Note: UDP (unlike TCP) is not a "reliable protocol", and fragmentation is not supported at the UDP level. UDP messages that are larger than the MTU size of the given network medium are normally fragmented at the IP layer. This is different from TCP, where large TCP messages can be fragmented at either the IP or TCP layers depending on the software implementation. The 82576 has the ability to segment UDP traffic (in addition to TCP traffic); however, because UDP packets are generally fragmented at the IP layer, the 82576's "TCP segmentation" feature is not normally conducive to handling UDP traffic.

7.2.4.5 TCP/UDP Segmentation Indication

Software indicates a TCP/UDP segmentation transmission context to the hardware by setting up a TCP/IP context transmit descriptor (see section 7.2.2). The purpose of this descriptor is to provide information to the hardware to be used during the TCP segmentation offload process.

Setting the TSE bit in the TUCMD field to 1b indicates that this descriptor refers to the TCP segmentation context (as opposed to the normal checksum offloading context). This causes the checksum offloading, packet length, header length, and maximum segment size parameters to be loaded from the descriptor into the device.

The TCP segmentation prototype header is taken from the packet data itself. Software must identify the type of packet that is being sent (IPv4/IPv6, TCP/UDP, other), calculate appropriate checksum offloading values for the desired checksum, and calculate the length of the header that is prepended. The header might be up to 240 bytes in length.

Once the TCP segmentation context has been set, the next descriptor provides the initial data to transfer. This first descriptor(s) must point to a packet of the type indicated.
Furthermore, the data it points to might need to be modified by software, as it serves as the prototype header for all packets within the TCP segmentation context. The following sections describe the supported packet types and the various updates that are performed by hardware. This should be used as a guide to determine what must be modified in the original packet header to make it a suitable prototype header.

The following summarizes the fields considered by the driver for modification in constructing the prototype header.

IP header

For IPv4 headers:
• The identification field should be set as appropriate for the first packet of the send (if not already).
• The header checksum should be zeroed out unless some adjustment is needed by the driver.

TCP header
• The sequence number should be set as appropriate for the first packet of the send (if not already).
• The PSH and FIN flags should be set as appropriate for the last packet of the send.
• The TCP checksum should be set to the partial pseudo-header sum as follows (there is a more detailed discussion of this in section 7.2.4.6):
Table 7-37. TCP partial pseudo-header sum for IPv4

IP source address
IP destination address
Zero | Layer 4 protocol ID | Zero

Table 7-38. TCP partial pseudo-header sum for IPv6

IPv6 source address
IPv6 final destination address
Zero
Zero | Next header

UDP header
• The checksum should be set as in the TCP header, above.

The following sections describe the updating process performed by the hardware for each frame sent using the TCP segmentation capability.

7.2.4.6 Transmit Checksum Offloading with TCP/UDP Segmentation

The 82576 supports checksum offloading as a component of the TCP segmentation offload feature and as a standalone capability. Section 7.2.5 describes the interface for controlling the checksum offloading feature. This section describes the feature as it relates to TCP segmentation.

The 82576 supports IP and TCP header options in the checksum computation for packets that are derived from the TCP segmentation feature.

Note: The 82576 is capable of computing one level of IP header checksum and one TCP/UDP header and payload checksum. In the case of multiple IP headers, the driver needs to compute all but one IP header checksum. The 82576 calculates checksums on the fly on a frame-by-frame basis and inserts the result in the IP/TCP/UDP headers of each frame. TCP and UDP checksums are the result of performing the checksum on all bytes of the payload and the pseudo header.

Two specific types of checksum are supported by the hardware in the context of the TCP segmentation offload feature:
• IPv4 checksum
• TCP checksum

Each packet that is sent via the TCP segmentation offload feature optionally includes the IPv4 checksum and either the TCP or UDP checksum.

All checksum calculations use a 16-bit wide one's complement checksum. The checksum word is calculated on the outgoing data.
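As a worked illustration of the IPv4 partial pseudo-header sum (table 7-37), the following Python sketch sums the source address, destination address, a zero byte, the layer 4 protocol ID, and a zero length field using 16-bit one's complement arithmetic. The function names are hypothetical. The sketch also assumes that the value placed in the checksum field is the folded, non-inverted sum; this section does not state the inversion convention explicitly.

```python
import socket
import struct


def ones_complement_sum16(data: bytes) -> int:
    """16-bit one's complement sum with end-around carry (not inverted)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def partial_pseudo_header_sum_ipv4(src: str, dst: str, protocol: int) -> int:
    # Pseudo header with the length field left at zero, per table 7-37.
    pseudo = (socket.inet_aton(src) + socket.inet_aton(dst)
              + struct.pack("!BBH", 0, protocol, 0))
    return ones_complement_sum16(pseudo)
```

For source 192.168.0.1, destination 192.168.0.2, and protocol 6 (TCP), the 16-bit words are 0xC0A8, 0x0001, 0xC0A8, 0x0002, 0x0006, and 0x0000; their sum 0x18159 folds to 0x815A.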
Table 7-40 summarizes the conditions under which checksum offloading can or should be calculated.

Table 7-39. Supported transmit checksum capabilities

Packet type | Hardware IP checksum calculation | Hardware TCP/UDP checksum calculation
IPv4 packets | Yes | Yes
IPv6 packets (no IP checksum in IPv6) | N/A | Yes
Packet is greater than 1518/1522/1526 bytes (LPE=1b) | Yes | Yes
Packet has 802.3ac tag | Yes | Yes
Packet has IP options (IP header is longer than 20 bytes) | Yes | Yes
Packet has TCP or UDP options | Yes | Yes
IP header's protocol field contains a protocol number other than TCP or UDP | Yes | No

Table 7-40. Conditions for checksum offloading

Packet type | IPv4 | TCP/UDP | Reason
Non TSO | Yes | No | IP raw packet (non TCP/UDP protocol)
Non TSO | Yes | Yes | TCP segment or UDP datagram with checksum offload
Non TSO | No | No | Non-IP packet or checksum not offloaded
TSO | Yes | Yes | For TSO, checksum offload must be done

7.2.4.7 IP/TCP/UDP Header Updating

The IP/TCP or IP/UDP header is updated for each outgoing frame based on the IP/TCP header prototype, which hardware DMAs from the first descriptor(s). The checksum fields and other header information are later updated on a frame-by-frame basis. The updating process is performed concurrently with the packet data fetch.

The following sections define which fields are modified by hardware during the TCP segmentation process by the 82576.

Note: Software must make sure the PAYLEN and HDRLEN values of context descriptors are correct. Otherwise, the failure of a large send due to either under-run or over-run might cause hardware to send bad packets or even cause Tx hardware to hang. The indication of large send failure can be checked in the TSCTFC statistic register.

7.2.4.7.1 TCP/IP/UDP Header for the First Frames

The hardware makes the following changes to the headers of the first packet that is derived from each TCP segmentation context.

MAC header (for SNAP)
• Type/Len field = MSS + MACLEN + IPLEN + L4LEN - 14
IPv4 header
• IP total length = MSS + L4LEN + IPLEN
• IP checksum

IPv6 header
• Payload length = MSS + L4LEN + IPV6_HDR_EXTENSION (1)

TCP header
• Sequence number: The value is the sequence number of the first TCP byte in this frame.
• The flag values of the first frame are set by ANDing the flag word in the pseudo header with DTXTCPFLGL.TCP_FLG_FIRST_SEG. The default value of DTXTCPFLGL.TCP_FLG_FIRST_SEG is set so that the FIN flag and the PSH flag are cleared in the first frame.
• TCP checksum

7.2.4.7.2 TCP/IP/UDP Headers for the Subsequent Frames

The hardware makes the following changes to the headers for subsequent packets that are derived as part of a TCP segmentation context:

Number of bytes left for transmission = PAYLEN - (N * MSS), where N is the number of frames that have been transmitted.

MAC header (for SNAP packets)
• Type/Len field = MSS + MACLEN + IPLEN + L4LEN - 14

IPv4 header
• IP identification: incremented from last value (wrap around)
• IP total length = MSS + L4LEN + IPLEN
• IP checksum

IPv6 header
• Payload length = MSS + L4LEN + IPV6_HDR_EXTENSION (1)

TCP header
• Sequence number update: Add the previous TCP payload size to the previous sequence number value. This is equivalent to adding the MSS to the previous sequence number.
• The flag values of the subsequent frames are set by ANDing the flag word in the pseudo header with DTXTCPFLGL.TCP_FLG_MID_SEG. The default value of DTXTCPFLGL.TCP_FLG_MID_SEG is set so that the FIN flag and the PSH flag are cleared in these frames.
• TCP checksum

UDP header
• UDP length = MSS + L4LEN
• UDP checksum

1. IPV6_HDR_EXTENSION is calculated as IPLEN - 40 bytes.
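The per-frame updates for an IPv4/TCP send can be traced with a short Python sketch. It is a software model under stated assumptions, not the hardware implementation: the function name and the dictionary keys are hypothetical, and the per-frame payload sizes are passed in as a list. The last-frame rule in section 7.2.4.7.3 has the same form, with the remaining payload bytes in place of MSS, so the same loop covers it.

```python
from typing import Dict, List, Sequence


def ipv4_tcp_frame_fields(payloads: Sequence[int], iplen: int, l4len: int,
                          first_seq: int, first_ip_id: int) -> List[Dict[str, int]]:
    """Header values that change from frame to frame (IPv4/TCP case)."""
    fields = []
    seq, ip_id = first_seq, first_ip_id
    for payload in payloads:
        fields.append({
            "ip_total_length": payload + l4len + iplen,
            "ip_id": ip_id,
            "tcp_seq": seq,
        })
        # Next frame: add the payload just sent; TCP sequence numbers are 32 bits.
        seq = (seq + payload) & 0xFFFFFFFF
        # IP identification is incremented and wraps at 16 bits.
        ip_id = (ip_id + 1) & 0xFFFF
    return fields
```

With payloads of 1460, 1460, and 1080 bytes, IPLEN = 20, and L4LEN = 20, the IP total length is 1500 for the first two frames and 1120 for the last one.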
7.2.4.7.3 TCP/IP/UDP Headers for the Last Frame

The hardware makes the following changes to the headers for the last frame of a TCP segmentation context:

Last frame payload bytes = PAYLEN - (N * MSS)

MAC header (for SNAP packets)
• Type/Len field = last frame payload bytes + MACLEN + IPLEN + L4LEN - 14

IPv4 header
• IP total length = last frame payload bytes + L4LEN + IPLEN
• IP identification: incremented from last value (wrap around based on 16-bit width)
• IP checksum

IPv6 header
• Payload length = last frame payload bytes + L4LEN + IPV6_HDR_EXTENSION (IPV6_HDR_EXTENSION is calculated as IPLEN - 40 bytes)

TCP header
• Sequence number update: Add the previous TCP payload size to the previous sequence number value. This is equivalent to adding the MSS to the previous sequence number.
• The flag values of the last frame are set by ANDing the flag word in the pseudo header with DTXTCPFLGH.TCP_FLG_LST_SEG. The default value of DTXTCPFLGH.TCP_FLG_LST_SEG is set so that the FIN flag and the PSH flag are set in the last frame.
• TCP checksum

UDP header
• UDP length = last frame payload bytes + L4LEN
• UDP checksum

7.2.4.8 IP/TCP/UDP Checksum Offloading

The 82576 performs checksum offloading as part of the TCP segmentation offload feature. These specific checksums are supported under TCP segmentation:
• IPv4 checksum
• TCP checksum

See section 7.2.5 for a description of checksum offloading of a single-send packet.

7.2.4.9 Data Flow

The flow used by the 82576 to do a TCP segmentation is as follows:
1. Get a descriptor with a request for a TSO offload of a TCP packet.
2. First segment processing:
a. Fetch all the buffers containing the header as calculated by the MACLEN, IPLEN, and L4LEN fields. Save the addresses and lengths of the buffers containing the header (up to 4 buffers). The header content is not saved.
b. Fetch data up to the MSS from subsequent buffers and calculate the adequate checksum(s).
c. Update the header accordingly and update the internal state of the packet (next data to fetch and IP SN).
d. Send the packet to the network.
e. If the total packet was sent, go to step 4. Else continue.
3. Next segments:
a. Wait for the next arbitration of this queue.
b. Fetch all the buffers containing the header from the saved addresses. Subsequent reads of the header might be done with a no-snoop attribute.
c. Fetch data up to the MSS or end of packet from subsequent buffers and calculate the adequate checksum(s).
d. Update the header accordingly and update the internal state of the packet (next data to fetch and IP SN).
e. If the total packet was sent, the request is done; else restart from step 3.
4. Release all buffers (update head pointer).

Note: Descriptors are fetched in a parallel process according to the consumption of the buffers.

7.2.5 Checksum Offloading in Non-Segmentation Mode

The previous section on TCP segmentation offload describes the IP/TCP/UDP checksum offloading mechanism used in conjunction with TCP segmentation. The same underlying mechanism can also be applied as a standalone feature. The main difference in normal packet mode (non-TCP segmentation) is that only the checksum fields in the IP/TCP/UDP headers need to be updated.

Before taking advantage of the 82576's enhanced checksum offload capability, a checksum context must be initialized. For the normal transmit checksum offload feature, this is performed by providing the device with a TCP/IP context descriptor with TUCMD.TSE=0b.
Setting TSE=0b indicates that the normal checksum context is being set, as opposed to the segmentation context. For additional details on contexts, refer to section 7.2.2.4.

Note: Enabling the checksum offloading capability without first initializing the appropriate checksum context leads to unpredictable results. CRC appending (CMD.IFCS) must be enabled in TCP/IP checksum mode, since the CRC must be inserted by hardware after the checksums have been calculated.

As mentioned in section 7.2.2, transmit descriptors, it is not necessary to set a new context for each new packet. In many cases, the same checksum context can be used for a majority of the packet stream. In this case, some performance can be gained by only changing the context on an as-needed basis or electing to use the offload feature only for a particular traffic type, thereby avoiding all context descriptors except for the initial one.

Each checksum operates independently. Insertion of the IP and TCP checksums for each packet is enabled through the transmit data descriptor POPTS.IXSM and POPTS.TXSM fields, respectively.
7.2.5.1 IP Checksum

Three fields in the transmit context descriptor set the context of the IP checksum offloading feature:
• TUCMD.IPV4
• IPLEN
• MACLEN

TUCMD.IPV4=1b specifies that the packet type for this context is IPv4 and that the IP header checksum should be inserted. TUCMD.IPV4=0b indicates that the packet type is IPv6 (or some other protocol) and that the IP header checksum should not be inserted.

MACLEN specifies the byte offset from the start of the DMA'd data to the first byte to be included in the checksum, the start of the IP header. The minimal allowed value for this field is 12. Note that the maximum value for this field is 127. This is adequate for typical applications.

Note: The MACLEN+IPLEN value needs to be less than the total DMA length for a packet. If this is not the case, the results are unpredictable.

IPLEN specifies the IP header length. The maximum allowed value for this field is 511 bytes.

MACLEN+IPLEN specifies where the IP checksum should stop. This is limited to the first 127+511 bytes of the packet and must be less than or equal to the total length of a given packet. If this is not the case, the result is unpredictable.

Note: For IPsec packets offloaded by hardware in Tx, it is assumed that the IPLEN provided by software in the Tx context descriptor is the sum of the IP header length and the IPsec header length. Thus, for the IPv4 header checksum offload, hardware can no longer rely on the IPLEN field provided by software in the Tx context descriptor, but relies on the fact that no IPv4 options are present in the packet. Consequently, for IPsec offload packets, hardware always computes the IP header checksum over a fixed 20 bytes.

The 16-bit IPv4 header checksum is placed at the two bytes starting at MACLEN+10.
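The placement rule can be checked with a software model of the IPv4 header checksum. In the Python sketch below, the function name is hypothetical and the buffer stands for the DMA'd packet data; it computes the standard 16-bit one's complement header checksum over IPLEN bytes starting at MACLEN and stores it at MACLEN+10. It models the non-IPsec case only.

```python
import struct


def insert_ipv4_header_checksum(frame: bytearray, maclen: int, iplen: int) -> int:
    """Compute the IPv4 header checksum and store it at offset maclen + 10."""
    if iplen % 2 or maclen + iplen > len(frame):
        raise ValueError("header does not fit in the frame or has odd length")
    csum_off = maclen + 10
    frame[csum_off:csum_off + 2] = b"\x00\x00"  # field counts as zero while summing
    total = 0
    for (word,) in struct.iter_unpack("!H", bytes(frame[maclen:maclen + iplen])):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    checksum = ~total & 0xFFFF
    frame[csum_off:csum_off + 2] = struct.pack("!H", checksum)
    return checksum
```

With a 14-byte Ethernet header (MACLEN = 14) and a 20-byte IPv4 header, the checksum lands at bytes 24 and 25 of the frame.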
As mentioned in section 7.2.2.2, transmit contexts, it is not necessary to set a new context for each new packet. In many cases, the same checksum context can be used for a majority of the packet stream. In this case, some performance can be gained by only changing the context on an as-needed basis or electing to use the offload feature only for a particular traffic type, thereby avoiding all context descriptors except for the initial one.

7.2.5.2 TCP Checksum

Three fields in the transmit context descriptor set the context of the TCP checksum offloading feature:
• MACLEN
• IPLEN
• TUCMD.L4T

TUCMD.L4T=1b specifies that the packet type is TCP and that the 16-bit TCP header checksum should be inserted at byte offset MACLEN+IPLEN+16. TUCMD.L4T=0b indicates that the packet is UDP and that the 16-bit checksum should be inserted starting at byte offset MACLEN+IPLEN+6.

IPLEN+MACLEN specifies the byte offset from the start of the DMA'd data to the first byte to be included in the checksum, the start of the TCP header. The minimal allowed value for this sum is 32/42 for UDP or TCP respectively.
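The insertion offsets above reduce to simple arithmetic on the context descriptor fields. The Python sketch below is illustrative only: the function name and the string keys "tcp" and "udp" are hypothetical, and the range checks reuse the MACLEN (12 to 127) and IPLEN (at most 511) limits given in section 7.2.5.1.

```python
# Offset of the checksum field inside the layer 4 header, per this section.
L4_CHECKSUM_FIELD_OFFSET = {"tcp": 16, "udp": 6}


def l4_checksum_offset(maclen: int, iplen: int, l4_type: str) -> int:
    """Byte offset, from the start of the DMA'd data, of the L4 checksum field."""
    if not 12 <= maclen <= 127:
        raise ValueError("MACLEN must be between 12 and 127")
    if not 0 < iplen <= 511:
        raise ValueError("IPLEN must be between 1 and 511")
    return maclen + iplen + L4_CHECKSUM_FIELD_OFFSET[l4_type]
```

For an untagged Ethernet frame with a 20-byte IPv4 header (MACLEN = 14, IPLEN = 20), the TCP checksum sits at byte offset 50 and the UDP checksum at byte offset 40.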
note: the iplen+maclen+l4len value needs to be less than the total dma length for a packet. if this is not the case, the results are unpredictable. the tcp/udp checksum always continues to the last byte of the dma data.

note: for non-tso, software still needs to calculate a full checksum for the tcp/udp pseudo-header. this checksum of the pseudo-header should be placed in the packet data buffer at the appropriate offset for the checksum calculation.

7.2.5.3 sctp crc offloading

for sctp packets, a crc32 checksum offload is provided. three fields in the transmit context descriptor set the context of the sctp checksum off loading feature:
• maclen
• iplen
• tucmd.l4t

tucmd.l4t=10b specifies that the packet type is sctp, and that the 32-bit sctp crc should be inserted at byte offset maclen+iplen+8. iplen+maclen specifies the byte offset from the start of the dma'd data to the first byte to be included in the checksum, the start of the sctp header. the minimal allowed value for this sum is 26. the sctp crc calculation always continues to the last byte of the dma data. the sctp total l3 payload size (paylen - iplen - maclen) should be a multiple of 4 bytes (sctp padding not supported).

note: tso is not available for sctp packets. software must initialize the sctp crc field to zero (0x00000000).

7.2.5.4 checksum supported per packet types

the following table summarizes which checksum is supported per packet type.

note: tso is not supported for packet types for which ip checksum & tcp checksum can not be calculated.

table 7-41. checksum per packet type

packet type | hardware ip checksum calculation | hardware tcp/udp/sctp checksum calculation
ipv4 packets | yes | yes
ipv6 packets | no (n/a) | yes
ipv6 packet with next header options: | |
  • hop-by-hop options | no (n/a) | yes
  • destinations options | no (n/a) | yes
  • routing (w len 0b) | no (n/a) | yes
  • routing (w len >0b) | no (n/a) | no
  • fragment | no (n/a) | no
  • home option | no (n/a) | no
  • security option (ah/esp) | yes (1) | yes (1)
ipv4 tunnels: | |
  • ipv4 packet in an ipv4 tunnel | either ip or tcp/sctp (2) | either ip or tcp/sctp (2)
  • ipv6 packet in an ipv4 tunnel | either ip or tcp/sctp (2) | either ip or tcp/sctp (2)
ipv6 tunnels: | |
  • ipv4 packet in an ipv6 tunnel | no | yes
  • ipv6 packet in an ipv6 tunnel | no | yes
packet is an ipv4 fragment | yes | no
packet is greater than 1518/1522/1526 bytes (lpe=1b) | yes | yes
packet has 802.3ac tag | yes | yes
ipv4 packet has ip options and no ipsec header (ip header is longer than 20 bytes) | yes | yes
ipv4 packet has ipsec header without ip options | yes (1) | yes (1)
packet has tcp or udp options | yes | yes
ip header's protocol field contains protocol # other than tcp or udp | yes | no

(1) only offloaded flows.
(2) for the tunneled case, the driver might do only the tcp checksum or ipv4 checksum. if tcp checksum is desired, the driver should define the ip header length as the combined length of both ip headers in the packet. if an ipv4 checksum is required, the ip header length should be set to the ipv4 header length.

7.2.6 multiple transmit queues

the number of transmit queues is increased to 16, to match the expected number of processors on most server platforms and to support the new virtualization mode. if there are more cpus than queues, then one queue might be used to service more than one cpu. for the transmission process, each thread might set a queue in the host memory of the cpu it is tied to.

7.2.6.1 bandwidth allocation to virtual machines / transmit queues

when operated in either vmdq2 or sr-iov mode, the 82576 has the ability to control the tx bandwidth used by each virtual machine (vm). since in these virtualization modes each tx queue is owned by a separate vm (or a separate set of vms), bandwidth allocation to vms is performed by assigning bandwidth shares to tx queues. a rate-controller is internally associated to a tx queue to maintain its allocated bandwidth share.
the bandwidth share represents the minimum percentage of the link's bandwidth that is guaranteed to be granted to the vm. bandwidth unused by a vm is redistributed among the others according to their relative bandwidth shares. if the pcie bandwidth available for transmission drops below half of the link's bandwidth, the bandwidth allocation scheme degenerates into a scheme close to packet-based round-robin arbitration between the vms.

if the link is operated at 10mbps, bandwidth allocation to vms must be disabled, as in non-virtualized contexts, and tx queues are served in a packet-based round-robin manner.

a vm can be operated in a "bandwidth takeover" mode, where it takes over for itself all bandwidth left unused by others. when several vms are operated in this mode, the unused bandwidth left by others is equally distributed among them, in a packet-based round-robin manner.

the bandwidth share scheme is configured by the following set of registers:
• vmbacs, to control the general operation of the bandwidth allocation to vms feature.
• vmbammw, to set the maximum amount of tx payload compensation a vm can accumulate in case it temporarily does not use its allocated bandwidth.
• vmbasel, to select the vm / tx queue for which a bandwidth share is configured via the vmbac register.
• vmbac, to set the minimum rate allocated to a vm.

7.3 interrupts

7.3.1 mapping of interrupt causes

the 82576 supports the following interrupt modes:
• pci legacy interrupts or msi - selected when gpie.multiple_msix is 0b.
• msi-x in non-iov mode - selected when gpie.multiple_msix is 1b and the vfe bit in the pcie sr-iov control register is cleared.
• msi-x in iov mode - selected when gpie.multiple_msix is 1b and the vfe bit in the pcie sr-iov control register is set.
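the selection rule in the list above reduces to two bits. the sketch below restates it as a small function; the function name and the returned strings are hypothetical labels, and the two arguments stand for gpie.multiple_msix and the vfe bit of the pcie sr-iov control register.

```python
# Hypothetical helper restating the mode-selection rule; not a driver API.
def interrupt_mode(multiple_msix: bool, vfe: bool) -> str:
    """Return the interrupt mode implied by GPIE.Multiple_MSIX and the SR-IOV VFE bit."""
    if not multiple_msix:
        # VFE is not part of the selection when Multiple_MSIX is 0b.
        return "legacy-or-msi"
    return "msi-x-iov" if vfe else "msi-x-non-iov"


print(interrupt_mode(False, False))  # legacy-or-msi
print(interrupt_mode(True, False))   # msi-x-non-iov
print(interrupt_mode(True, True))    # msi-x-iov
```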
note: if only one msi-x vector is allocated by the operating system, then the driver might use the non msi-x mapping method even in msi-x mode.

mapping of interrupt causes is different in each of the above modes and is described below.

7.3.1.1 legacy and msi interrupt modes

in legacy and msi modes, an interrupt cause is reflected by setting a bit in the eicr register. this section describes the mapping of interrupt causes (a specific rx queue event or an lsc event) to bits in the eicr. mapping of queue-related causes is accomplished through the ivar register. each possible queue interrupt cause (each rx or tx queue) is allocated an entry in the ivar, and each entry in the ivar identifies one bit in the eicr register among the bits allocated to queue interrupt causes. it is possible to map multiple interrupt causes into the same eicr bit.
in this mode, causes can be mapped to the first 16 bits of the eicr register. interrupt causes related to non-queue causes are mapped into the icr legacy register; each cause is allocated a separate bit. the sum of all causes is reflected in the other bit in eicr. figure 7-13 below describes the allocation process. the following configuration and parameters are involved:
• the ivar[7:0] entries map 16 tx queues and 16 rx queues into eicr[15:0] bits.
• the ivar_misc that maps non-queue causes is not used.
• the eicr[30] bit is allocated to the tcp timer interrupt cause.
• the eicr[31] bit is allocated to the other interrupt causes summarized in the icr reg.
• a single interrupt vector is provided.

figure 7-13. cause mapping in legacy mode

the table below maps the different interrupt causes into the ivar registers.

table 7-42. cause allocation in the ivar registers - msi and legacy mode

interrupt | entry | description
rx_i | i*4 (i = 0..7) | receive queues i: associates an interrupt occurring in the rx queues i with a corresponding bit in the eicr register.
tx_i | i*4+1 (i = 0..7) | transmit queues i: associates an interrupt occurring in the tx queues i with a corresponding bit in the eicr register.
rx_i | (i-8)*4+2 (i = 8..15) | receive queues i: associates an interrupt occurring in the rx queues i with a corresponding bit in the eicr register.
tx_i | (i-8)*4+3 (i = 8..15) | transmit queues i: associates an interrupt occurring in the tx queues i with a corresponding bit in the eicr register.

7.3.1.2 msi-x mode - non-iov mode

in a non single root - iov setup (sr-iov capability is not exposed in the pcie configuration space), the 82576 can request up to 25 vectors.
in msi-x mode, an interrupt cause is mapped into an msi-x vector. this section describes the mapping of interrupt causes (a specific rx queue event or other events) to msi-x vectors. mapping is accomplished through the ivar register. each possible cause for an interrupt is allocated an entry in the ivar, and each entry in the ivar identifies one msi-x vector. it is possible to map multiple interrupt causes into the same msi-x vector.

the eicr also reflects interrupt vectors. the eicr bits allocated for queue causes reflect the msi-x vector (bit 2 is set when msi-x vector 2 is used). interrupt causes related to non-queue causes are mapped into the icr (as in the legacy case). the msi-x vector for all such causes is reflected in the eicr.

the following configuration and parameters are involved:
• the ivar[7:0] entries map 16 tx queues, 16 rx queues, a tcp timer, and other events to up to 23 interrupt vectors.
• the ivar_misc register maps a tcp timer and other events to 2 msi-x vectors.

figure 7-14 describes the allocation process. table 7-43 below defines which interrupt cause is represented by each entry in the msi-x allocation registers.

figure 7-14. cause mapping in msi-x mode
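the entry numbering that table 7-43 assigns to each cause can be sketched as below. this is an illustration of the table's arithmetic only: the function name and the cause strings are hypothetical, and the sketch says nothing about how entries are packed into the ivar and ivar_misc registers themselves.

```python
# Hypothetical helper reproducing the entry numbers listed in table 7-43.
def ivar_entry(cause: str, queue: int = 0) -> int:
    """Return the allocation entry number for a cause ("rx", "tx", "tcp_timer", "other")."""
    if cause == "tcp_timer":
        return 32
    if cause == "other":
        return 33
    if cause not in ("rx", "tx"):
        raise ValueError(f"unknown cause: {cause!r}")
    if not 0 <= queue <= 15:
        raise ValueError("queue index must be in 0..15")
    tx = 1 if cause == "tx" else 0
    if queue < 8:
        return queue * 4 + tx            # rx: i*4, tx: i*4+1
    return (queue - 8) * 4 + 2 + tx      # rx: (i-8)*4+2, tx: (i-8)*4+3


print(ivar_entry("rx", 0), ivar_entry("tx", 0))    # 0 1
print(ivar_entry("rx", 8), ivar_entry("tx", 8))    # 2 3
print(ivar_entry("rx", 15), ivar_entry("tx", 15))  # 30 31
```

the 32 queue causes land on entries 0 to 31 without collisions, which together with the tcp timer (32) and other causes (33) gives 34 entries in total.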
in non sr-iov mode, the software has access to 34 mapping entries to map each cause to one of the 25 msi-x vectors.

table 7-43. cause allocation in the ivar registers - non-iov mode

interrupt | entry | description
rx_i | i*4 (i = 0..7) | receive queues i: associates an interrupt occurring in the rx queues i with a corresponding entry in the msi-x allocation registers.
tx_i | i*4+1 (i = 0..7) | transmit queues i: associates an interrupt occurring in the tx queues i with a corresponding entry in the msi-x allocation registers.
rx_i | (i-8)*4+2 (i = 8..15) | receive queues i: associates an interrupt occurring in the rx queues i with a corresponding entry in the msi-x allocation registers.
tx_i | (i-8)*4+3 (i = 8..15) | transmit queues i: associates an interrupt occurring in the tx queues i with a corresponding entry in the msi-x allocation registers.
tcp timer | 32 | tcp timer: associates an interrupt issued by the tcp timer with a corresponding entry in the msi-x allocation registers.
other cause | 33 | other causes: associates an interrupt issued by the "other causes" with a corresponding entry in the msi-x allocation registers.

7.3.1.3 msi-x interrupts in sr-iov mode

each of the vf functions in pci-sig sr-iov mode is allocated 3 msi-x vectors. the pf can request up to 10 vectors.

interrupt allocation for the physical function (pf) is done as in the msi-x non-iov case. however, the pf should not assign interrupt vectors to queues not assigned to it. the ivar_misc register allocates non-queue interrupts as in the non-iov case with a single change: the entry assigned to "other" causes also handles interrupts on the mailbox. although the pf is allocated up to 10 vectors, these vectors share the internal interrupts with the vfs. see section 7.3.3.1 for details of the sharing of the internal interrupts.

each of the vfs in iov mode is allocated separate ivar registers (called vtivar), translating its queue-related interrupt causes into msi-x vectors for this virtual function. the ivar register has one entry per tx or rx queue. a vtivar_misc register is provided to map the mailbox interrupt into an msi-x vector. the pf can allocate interrupt causes not used by the vfs to one of its own vectors.
the eicr of each vf or of the pf reflects the status of the msi-x vectors allocated to this function. table 7-44 below defines, for a given vm (not the pf), which interrupt cause is represented by each entry in the msi-x allocation registers. in iov mode, the software has access to 5 mapping entries to map each cause to one out of 3 msi-x vectors. the 3 vm vectors (per vm) can be allocated to one or more causes (2 queue traffic interrupts, mailbox interrupt).

figure 7-15. cause mapping of a vf in msi-x mode (iov)

table 7-44. cause allocation for a vf in the vtivar registers - iov mode

interrupt | entry | description
rx queue i (i = 0..1) | i*2 | receive queue i: associates an interrupt occurring in rx queue i with a corresponding entry in the msi-x allocation registers.
tx queue i (i = 0..1) | i*2+1 | transmit queue i: associates an interrupt occurring in tx queue i with a corresponding entry in the msi-x allocation registers.

7.3.2 registers

the interrupt logic consists of the registers listed in the tables below, plus the registers associated with msi/msi-x signaling. the first table describes the use of the registers in legacy mode and the second one the use of the registers when using the extended interrupts functionality.

table 7-45. interrupt registers - legacy mode

register | acronym | function
interrupt cause | icr | records interrupt conditions.
interrupt cause set | ics | allows software to set bits in the icr.
interrupt mask set/read | ims | sets or reads bits in the interrupt mask.
interrupt mask clear | imc | clears bits in the interrupt mask.
interrupt acknowledge auto-mask | iam | under some conditions, the content of this register is copied to the mask register following read or write of icr.
table 7-46. interrupt registers - extended mode

register | acronym | function
extended interrupt cause | eicr | records interrupt causes from receive and transmit queues. an interrupt is signaled when unmasked bits in this register are set.
extended interrupt cause set | eics | allows software to set bits in the interrupt cause register.
extended interrupt mask set/read | eims | sets or reads bits in the interrupt mask.
extended interrupt mask clear | eimc | clears bits in the interrupt mask.
extended interrupt auto clear | eiac | allows bits in the eicr to be cleared automatically following an msi-x interrupt without a read or write of the eicr.
extended interrupt acknowledge auto-mask | eiam | this register is used to decide which masks are cleared in the extended mask register following read or write of eicr or which masks are set following a write to eics. in msi-x mode, this register also controls which bits in eimc are cleared automatically following an msi-x interrupt.
interrupt cause | icr | records interrupt conditions for special conditions: a single interrupt from all the conditions of icr is reflected in the "other" field of the eicr.
interrupt cause set | ics | allows software to set bits in the icr.
interrupt mask set/read | ims | sets or reads bits in the other interrupt mask.
interrupt mask clear | imc | clears bits in the other interrupt mask.
interrupt acknowledge auto-mask | iam | under some conditions, the content of this register is copied to the mask register following read or write of icr.
general purpose interrupt enable | gpie | controls different behaviors of the interrupt mechanism.

7.3.2.1 interrupt cause register (icr)

7.3.2.1.1 legacy mode

in legacy mode, icr is used as the sole interrupt cause register. upon reception of an interrupt, the interrupt handling routine can read this register to find out the causes of the interrupt.

7.3.2.1.2 advanced mode

in advanced mode, this register captures the interrupt causes not directly captured by the eicr. these are infrequent management interrupts and error conditions. note that when eicr is used in advanced mode, the rx/tx related bits in icr should be masked.

icr bits are cleared on register read. if gpie.nsicr = 0b, then the clear on read occurs only if no bit is set in the ims, or if at least one bit is set in the ims and there is a true interrupt as reflected in icr.inta.
7.3.2.2 interrupt cause set register (ics)

this register allows software to set bits of the icr by writing a 1b to the corresponding bits in ics. it is usually used to rearm interrupts that software did not have time to handle in the current interrupt routine.

7.3.2.3 interrupt mask set/read register (ims)

an interrupt is enabled if its corresponding mask bit in this register is set to 1b, and disabled if its corresponding mask bit is set to 0b. a pcie interrupt is generated whenever one of the bits in this register is set and the corresponding interrupt condition occurs. the occurrence of an interrupt condition is reflected by having a bit set in the interrupt cause register.

reading this register returns which bits have an interrupt mask set. a particular interrupt might be enabled by writing a 1b to the corresponding mask bit in this register. any bits written with a 0b are unchanged. thus, if software desires to disable a particular interrupt condition that had been previously enabled, it must write to the interrupt mask clear register (see below), rather than writing a 0b to a bit in this register.

7.3.2.4 interrupt mask clear register (imc)

software blocks interrupts by clearing the corresponding mask bit. this is accomplished by writing a 1b to the corresponding bit in this register. bits written with 0b are unchanged (their mask status does not change).

7.3.2.5 interrupt acknowledge auto-mask register (iam)

an icr read or write has the side effect of writing the contents of this register to the mask register. if gpie.nsicr = 0b, then the copy of this register to the mask register occurs only if at least one bit is set in the mask register and there is a true interrupt as reflected in icr.inta.
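the write-1-to-set and write-1-to-clear behavior of ims and imc (sections 7.3.2.3 and 7.3.2.4) can be modeled in a few lines. this is a software model for illustration, not register access code: the class and method names are hypothetical, and the 32-bit register width is an assumption.

```python
# Hypothetical software model of the IMS/IMC pair; not hardware access code.
REG_MASK = 0xFFFFFFFF  # assumed 32-bit register width


class InterruptMask:
    def __init__(self) -> None:
        self._mask = 0

    def write_ims(self, value: int) -> None:
        """Bits written as 1b are set in the mask; bits written as 0b are unchanged."""
        self._mask |= value & REG_MASK

    def write_imc(self, value: int) -> None:
        """Bits written as 1b are cleared in the mask; bits written as 0b are unchanged."""
        self._mask &= ~value & REG_MASK

    def read_ims(self) -> int:
        """Reading IMS returns the current mask value."""
        return self._mask


m = InterruptMask()
m.write_ims(0b0101)
m.write_ims(0b0000)         # writing 0b does not disable anything
m.write_imc(0b0001)         # disabling requires a write to IMC
print(bin(m.read_ims()))    # 0b100
```

because neither write needs the current mask value, software never has to read-modify-write the mask register.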
7.3.2.6 extended interrupt cause registers (eicr)

7.3.2.6.1 msi/int-a mode

this register records the interrupt causes to provide software with information on the interrupt source. the interrupt causes include:
1. the receive and transmit queues: each queue (either tx or rx) can be mapped to one of the 16 interrupt cause bits (rtxq) available in this register, according to the mapping in the ivar registers.
2. indication for the tcp timer interrupt.
3. legacy and other indications: when any interrupt in the interrupt cause register is active.

writing 1bs clears the corresponding bits in this register. most systems have write-buffering that minimizes overhead, but this might require a read operation to guarantee that the write has been flushed from posted buffers. reading this register auto-clears all bits.
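as an illustration of the msi/int-a layout, the sketch below decodes an eicr value using the bit positions given in section 7.3.1.1 (eicr[15:0] for queue causes, eicr[30] for the tcp timer, eicr[31] for other causes) and models the write-1-to-clear rule. function and key names are hypothetical.

```python
# Hypothetical decoder for an EICR value in MSI/INT-A mode; names are illustrative.
def decode_eicr(value: int) -> dict:
    """Split an EICR value into queue cause bits, TCP timer and 'other' indications."""
    return {
        "rtxq": [bit for bit in range(16) if (value >> bit) & 1],
        "tcp_timer": bool((value >> 30) & 1),
        "other": bool((value >> 31) & 1),
    }


def write_to_clear(eicr: int, written: int) -> int:
    """Model of write-1-to-clear: bits written as 1b are cleared, others are kept."""
    return eicr & ~written & 0xFFFFFFFF


eicr = 0x80000005
print(decode_eicr(eicr))                  # {'rtxq': [0, 2], 'tcp_timer': False, 'other': True}
print(hex(write_to_clear(eicr, 0x1)))     # 0x80000004
```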
7.3.2.6.2 msi-x mode

this register records the interrupt vectors currently emitted. in this mode only the first 25 bits are valid.

for all the subsequent registers, in msi-x mode, each bit controls the behavior of one vector.

bits in this register can be configured to auto-clear when the msi-x interrupt message is sent, in order to minimize driver overhead when using msi-x interrupt signaling.

7.3.2.7 extended interrupt cause set register (eics)

this register allows software to set bits of the eicr by writing a 1b to the corresponding bits in eics. it is usually used to rearm interrupts that software did not have time to handle in the current interrupt routine.

7.3.2.8 extended interrupt mask set and read register (eims) & extended interrupt mask clear register (eimc)

interrupts appear on pcie only if the interrupt cause bit is a one and the corresponding interrupt mask bit is a one. software blocks assertion of an interrupt by clearing the corresponding bit in the mask register. the cause bit stores the interrupt event regardless of the state of the mask bit. separate clear (eimc) and set (eims) registers make this register more "thread safe" by avoiding a read-modify-write operation on the mask register. the mask bit is set for each bit written to a one in the set register (eims) and cleared for each bit written to a one in the clear register (eimc). reading the set register (eims) returns the current mask register value.

7.3.2.9 extended interrupt auto clear enable register (eiac)

each bit in this register enables clearing of the corresponding bit in eicr following interrupt generation. when a bit is set, the corresponding bit in eicr is automatically cleared following an interrupt. this feature should only be used in msi-x mode.
when used in conjunction with msi-x interrupt vector, this feature allows interrupt cause recognition, and selective interrupt cause, without requiring software to read or write the eicr register; therefore, the penalty related to a pcie read or write transaction is avoided. the process of interrupt cause bits reset is described below in section 7.3.4.

7.3.2.10 extended interrupt auto mask enable register (eiam)

each bit set in this register enables clearing of the corresponding bit in the extended mask register following read or write-to-clear to eicr. it also enables setting of the corresponding bit in the extended mask register following a write-to-set to eics. this mode is provided in case msi-x is not used, and therefore auto-clear through eiac register is not available.

in msi-x mode, the driver software might set the bits of this register to select mask bits that must be reset during interrupt processing. in this mode, each bit in this register enables clearing of the corresponding bit in eimc following interrupt generation.
7.3.2.11 gpie

there are a few bits in the gpie register that define the behavior of the interrupt mechanism. the setting of these bits is different in each mode of operation. the following table describes the recommended setting of these bits in the different modes:

table 7-47. settings for different interrupt modes

field | bit(s) | initial value | description | int-x/msi + legacy | int-x/msi + extend | msi-x multi vector | msi-x single vector
nsicr | 0 | 0b | non selective interrupt clear on read: when set, every read of icr clears it. when this bit is cleared, an icr read causes it to be cleared only if an actual interrupt was asserted or ims = 0b. | 0b (1) | 1b | 1b | 1b
multiple_msix | 4 | 0b | multiple vectors. 0b = non-msi-x, or msi-x with 1 vector; ivar maps rx/tx causes to 16 eicr bits, but msix[0] is asserted for all. 1b = msi-x mode; ivar maps rx/tx causes to 25 eicr bits. | 0b | 0b | 1b | 0b
eiame | 30 | 0b | when set, upon firing of an msi-x message, mask bits set in eiam associated with this message are cleared. otherwise, eiam is used only upon read or write of eicr/eics registers. | 0b | 0b | 1b | 1b
pba_support | 31 | 0b | pba support: when set, setting one of the extended interrupt masks via eims causes the pba bit of the associated msi-x vector to be cleared. otherwise, the 82576 behaves in a way supporting legacy int-x interrupts. should be cleared when working in int-x or msi mode and set in msi-x mode. | 0b | 0b | 1b | 1b

(1) in systems where interrupt sharing is not expected, the nsicr bit can also be set by legacy drivers. as this register affects the way the hardware interprets writes to the other interrupt control registers, it should be set to the right mode before any access to the other registers.

7.3.3 msi-x and vectors

msi-x defines a separate optional extension to basic msi functionality. compared to msi, msi-x supports a larger maximum number of vectors per function, the ability for software to control aliasing when fewer vectors are allocated than requested, plus the ability for each vector to use an independent address and data value, specified by a table that resides in memory space. however, most of the other characteristics of msi-x are identical to those of msi. for more information on msi-x, refer to the pci local bus specification, revision 3.0.

msi-x maps each of the intel® 82576 gbe controller interrupt causes into an interrupt vector that is conveyed by the 82576 as a posted-write pcie transaction. mapping of an interrupt cause into an msi-x vector is determined by system software (a device driver) through a translation table stored in the msi-x allocation registers. each entry of the allocation registers defines the vector for a single interrupt cause.

there are 34 extended interrupt causes in the 82576:
1. 32 traffic causes: 16 tx, 16 rx.
2. tcp timer.
3. other causes: summarizes legacy interrupts into one extended cause.

the way the 82576 exposes causes to the software is determined by the iov mode. see section 7.3.1 for details.

7.3.3.1 usage of spare msi-x vectors by physical function

the total number of available msi-x vectors is 34. the pf should not request vectors that may be later allocated to vfs. for example, if the driver knows that at most 6 vfs will be enabled, it can request up to 34 - 3*6 = 16 vectors. in any case, the pf can request 10 vectors, even if all the vfs are allocated. however, the number of internal interrupts is only 25. thus, when vfs are enabled, the pf should release all the internal interrupt resources allocated to the vfs.

the following table describes the pf vectors available according to the number of vfs enabled, assuming the pf requests up to 10 vectors. the available vectors can be referenced in the ivar registers and indicate which eitr registers are available for the pf.

table 7-48. internal vectors available to the pf

enabled vfs | available vectors
0-5 | 0-9
6 | 0-7
7 | 0-3
8 | 0

7.3.3.2 interrupt moderation

the 82576 implements interrupt moderation to reduce the number of interrupts software processes. the moderation scheme is based on the eitr (interrupt throttle register; see section 8.8.12). whenever an interrupt event happens, the corresponding bit in the eicr is activated. however, an interrupt message is not sent out on the pcie interface until the eitr counter assigned to that eicr bit has counted down to zero. as soon as the interrupt is issued, the eitr counter is reloaded with its initial value and the process repeats again. the flow follows the diagram below:
figure 7-16. interrupt throttle flow diagram

for cases where the 82576 is connected to a small number of clients, it is desirable to fire off the interrupt as soon as possible with minimum latency. for these cases, when the eitr counter counts down to zero and no interrupt event has happened, then the eitr counter is not reset but stays at zero. thus, the next interrupt event triggers an interrupt immediately. that scenario is illustrated as "case b" below.

figure 7-17. case a: heavy load, interrupts moderated
figure 7-18. light load, interrupts immediately on packet receive

7.3.3.2.1 more on using eitr

there is an eitr register for each msi-x vector. see also: section 8.8.12.

eitr provides a guaranteed inter-interrupt delay between interrupts asserted by the 82576, regardless of network traffic conditions. to independently validate configuration settings, software can use the following algorithm to convert the inter-interrupt interval value to the common interrupts/sec performance metric:

$$\text{interrupts/sec} = \left(1 \times 10^{-6}\,\text{sec} \times \text{interval}\right)^{-1}$$

a counter counts in units of $1 \times 10^{-6}$ sec. after counting "interval" number of units, an interrupt is sent to the software. the above equation gives the number of interrupts per second; its inverse, given further below, yields the interval value for a target interrupt rate.

for example, if the interval is programmed to 125 (decimal), the 82576 guarantees the processor does not receive an interrupt for 125 µs from the last interrupt. the maximum observable interrupt rate from the 82576 should then never exceed 8000 interrupts/sec.

inversely, the inter-interrupt interval value can be calculated as:

$$\text{inter-interrupt interval} = \left(1 \times 10^{-6}\,\text{sec} \times \text{interrupts/sec}\right)^{-1}$$

the optimal performance setting for this register is very system and configuration specific. an initial suggested range is 2 to 175 (0x02 to 0xaf).

note: setting eitr to a non zero value can cause an interrupt cause rx/tx statistics miscount.

7.3.4 clearing interrupt causes

the 82576 has three methods available to clear eicr bits: autoclear, clear-on-write, and clear-on-read. icr bits might only be cleared with clear-on-write or clear-on-read.
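returning to the eitr conversion in section 7.3.3.2.1, the two formulas can be checked numerically with a short sketch. the function names are hypothetical; the only facts taken from the text are the 1 µs counting unit and the 125 to 8000 example.

```python
# Hypothetical helpers for the EITR interval <-> rate conversion (1 us units).
UNITS_PER_SECOND = 1_000_000  # the counter counts in units of 1e-6 sec


def interrupts_per_sec(interval: int) -> float:
    """Maximum interrupt rate for a given interval value."""
    if interval <= 0:
        raise ValueError("interval must be positive")
    return UNITS_PER_SECOND / interval


def interval_for_rate(rate: float) -> float:
    """Interval value that caps the interrupt rate at `rate` interrupts/sec."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return UNITS_PER_SECOND / rate


print(interrupts_per_sec(125))   # 8000.0
print(interval_for_rate(8000))   # 125.0
```

applied to the suggested range, an interval of 2 corresponds to at most 500000 interrupts/sec, and an interval of 175 to roughly 5714 interrupts/sec.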
7.3.4.1 auto-clear

in systems that support msi-x, the interrupt vector allows the interrupt service routine to know the interrupt cause without reading the eicr. with interrupt moderation active, software load from spurious interrupts is minimized. in this case, the software overhead of an i/o read or write can be avoided by setting appropriate eicr bits to autoclear mode by setting the corresponding bits in the extended interrupt auto-clear enable register (eiac).

when auto-clear is enabled for an interrupt cause, the eicr bit is set when a cause event mapped to this vector occurs. when the eitr counter reaches zero, the msi-x message is sent on pcie. then the eicr bit is cleared and enabled to be set by a new cause event. the vector in the msi-x message signals software the cause of the interrupt to be serviced.

it is possible that, between the time the eicr bit is cleared and the time the interrupt service routine services the cause (for example, by checking the transmit and receive queues), another cause event occurs. that event is then serviced by the current isr call, yet the eicr bit remains set. this results in a "spurious interrupt". software can detect this case, for example if there are no entries that require service in the transmit and receive queues, and exit knowing that the interrupt has been automatically cleared. the use of interrupt moderation through the eitr register limits the extra software overhead that can be caused by these spurious interrupts.

7.3.4.2 write to clear

in the case where the driver wishes to configure itself in msi-x mode to not use the "auto-clear" feature, it might clear the eicr bits by writing to the eicr register. any bits written with a 1b are cleared. any bits written with a 0b remain unchanged.

7.3.4.3 read to clear

the eicr and icr registers are cleared on a read.
note that the driver should never do a read-to-clear of the eicr when in msi-x mode, since this might clear interrupt cause events which are processed by a different interrupt handler (assuming multiple vectors).

7.3.5 rate controlled low latency interrupts (lli)

there are some types of network traffic for which latency is a critical issue. for these types of traffic, interrupt moderation hurts performance by increasing latency between the time a packet is received by hardware and the time it is handed to the host operating system. this traffic can be identified by the 5-tuple value, in conjunction with control bits and specific size. in addition, packets with specific ethernet types, tcp flags or a specific vlan priority might generate an immediate interrupt.

low latency interrupts share the filters used by the queueing mechanism described in section 7.1.1. each of these filters, in addition to the queueing action, might also indicate that matching packets generate an immediate interrupt.

if a received packet matches one of these filters, hardware should interrupt immediately, overriding the interrupt moderation by the eitr counter.

each time a low latency interrupt is fired, the eitr interval is loaded and down-counting starts again.

the logic of the low latency interrupt mechanism is as follows:
• there are eight 5-tuple filters. the content of each filter is described in section 7.1.1.5. the immediate interrupt action of each filter can be enabled or disabled. if one of the filters detects a matching packet, an immediate interrupt is issued.
• when vlan priority filtering is enabled, vlan packets must trigger an immediate interrupt when the vlan priority is equal to or above the vlan priority threshold. this is regardless of the status of the 5-tuple filters.
• the syn packets filter defined in section 7.1.1.6 and the ethernet type filters defined in section 7.1.1.4 might also be used to indicate low latency interrupt conditions.

note: immediate interrupts are available only when using advanced receive descriptors and not for legacy descriptors. packets that are dropped or have errors do not cause a low latency interrupt.

7.3.5.1 rate control mechanism

in a network with a lot of latency-sensitive traffic, the low latency interrupt can eliminate the interrupt throttling capability by flooding the host with too many interrupts (more than the host can handle). in order to mitigate this, the intel® 82576 gbe controller supports a credit-based mechanism to control the rate of the low latency interrupts.

rules:
• the default value of each counter is 0b (no moderation). this also preserves backward compatibility.
• the counter increments at a configurable rate, and saturates at the maximum value (31d).
• the configurable rate granularity is 4 µs (250k interrupts/sec down to 250k/32 ~ 8k interrupts/sec).
• an lli might be issued as long as the counter value is strictly positive (> zero).
• the credit counter allows bursts of low latency interrupts, but the average interrupt rate is not more than the configured rate.
• each time a low latency interrupt is fired, the credit counter decrements by one.
• once the counter reaches zero, a low latency interrupt cannot be fired; it must wait for the next itr expiration or for the next increment of this counter (if the eitr expiration happens first, the counter does not decrement).

the following fields manage rate control of lli:
• the ll interval field in the gpie register controls the rate of credits.
• the 5-bit ll counter field in the eitr register contains the credits.

7.3.6 tcp timer interrupt

7.3.6.1 introduction

in order to implement tcp timers for ioat, software needs to take action periodically (every 10 milliseconds). today, the driver must rely on software-based timers, whose granularity can change from platform to platform. this software timer generates a software nic interrupt, which then allows the driver to perform timer functions as part of its usual dpc, avoiding cache thrash and enabling parallelization. the timer interval is system-specific.
it would be more accurate and more efficient for this periodic timer to be implemented in hardware. the driver would program a timeout value (usual value of 10 ms), and each time the timer expires, hardware sets a specific bit in the eicr. when an interrupt occurs (due to normal interrupt moderation schemes), software reads the eicr and discovers that it needs to process timer events during that dpc. the timeout should be programmable by the driver, and the driver should be able to disable the timer interrupt if it is not needed.

7.3.6.2 description

a stand-alone down-counter is implemented. an interrupt is issued each time the value of the counter reaches zero. software is responsible for setting the initial value of the timer in the duration field. kick-starting is done by writing 1b to the kickstart bit. following kick-starting, an internal counter is set to the value defined by the duration field. the counter is then decreased by one each millisecond. when the counter reaches zero, an interrupt is issued (see the eicr register, section 8.8.1). the counter restarts counting from its initial value if the loop field is set.

7.4 802.1q vlan support

the 82576 provides several specific mechanisms to support 802.1q vlans:
• optional adding (for transmits) and stripping (for receives) of ieee 802.1q vlan tags.
• optional ability to filter packets belonging to certain 802.1q vlans.

7.4.1 802.1q vlan packet format

the following table compares an untagged 802.3 ethernet packet with an 802.1q vlan tagged packet.

note: the crc for the 802.1q tagged frame is re-computed, so that it covers the entire tagged frame including the 802.1q tag header. also, the maximum frame size for an 802.1q vlan packet is 1522 octets, as opposed to 1518 octets for a normal 802.3z ethernet packet.

table 7-49. comparing packets

802.3 packet    #octets    802.1q vlan packet    #octets
da              6          da                    6
sa              6          sa                    6
type/length     2          802.1q tag            4
data            46-1500    type/length           2
crc             4          data                  46-1500
                           crc*                  4
7.4.2 802.1q tagged frames

for 802.1q, the tag header field consists of four octets comprising the tag protocol identifier (tpid) and the tag control information (tci), each taking 2 octets. the first 16 bits of the tag header make up the tpid. it contains the "protocol type", which identifies the packet as a valid 802.1q tagged packet.

the two octets making up the tci contain three fields:
• user priority (up)
• canonical form indicator (cfi). should be 0b for transmits. for receives, the device has the capability to filter out packets that have this bit set. see the cfien and cfi bits in the rctl described in section 8.10.1.
• vlan identifier (vid)

bit ordering is shown below.

table 7-50. tci bit ordering

octet 1              octet 2
up    cfi    vid

7.4.3 transmitting and receiving 802.1q packets

7.4.3.1 adding 802.1q tags on transmits

software might command the 82576 to insert an 802.1q vlan tag on a per packet or per flow basis. if ctrl.vme is set to 1b, and the vle bit in the transmit descriptor is set to 1b, then the 82576 inserts a vlan tag into the packet that it transmits over the wire. the tag protocol identifier (tpid) field of the 802.1q tag comes from the vet register. 802.1q tag insertion is done in different ways for legacy and advanced tx descriptors:
• legacy transmit descriptors: the tag control information (tci) of the 802.1q tag comes from the vlan field (see figure 7-9) of the descriptor. refer to table 7-26 for more information regarding hardware insertion of tags for transmits.
• advanced transmit descriptor: the tag control information (tci) of the 802.1q tag comes from the vlan tag field (see table 7.2.2.2.1) of the advanced context descriptor. the idx field of the advanced tx descriptor should be set to the appropriate context.

7.4.3.2 stripping 802.1q tags on receives

software might instruct the 82576 to strip 802.1q vlan tags from received packets. if the ctrl.vme bit is set to 1b, and the incoming packet is an 802.1q vlan packet (its ethernet type field matches the vet), then the 82576 strips the 4-byte vlan tag from the packet and stores the tci in the vlan tag field (see figure 7-5 and section 7.1.10.2) of the receive descriptor. the 82576 also sets the vp bit in the receive descriptor to indicate that the packet had a vlan tag that was stripped. if the ctrl.vme bit is not set, 802.1q packets can still be received if they pass the receive filter, but the vlan tag is not stripped and the vp bit is not set. refer to table 7-19 for more information regarding receive packet filtering.
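the tag handling of sections 7.4.2 and 7.4.3 can be sketched in software as follows. this is only an illustrative model, not driver or hardware code: all function names are hypothetical, the up/cfi/vid bit positions are taken from the standard ieee 802.1q tci layout, and 0x8100 (the standard 802.1q tpid) is assumed as the value programmed in the vet register. the crc is not modelled, so the frames here are assumed to carry no trailing crc.

```python
import struct

VET_DEFAULT = 0x8100  # assumption: standard 802.1Q TPID programmed in VET


def pack_tci(up: int, cfi: int, vid: int) -> int:
    """Build the 16-bit TCI: UP in bits 15:13, CFI in bit 12, VID in bits 11:0."""
    if not (0 <= up <= 7 and cfi in (0, 1) and 0 <= vid <= 0xFFF):
        raise ValueError("up, cfi or vid out of range")
    return (up << 13) | (cfi << 12) | vid


def unpack_tci(tci: int) -> tuple:
    """Split a 16-bit TCI into (up, cfi, vid)."""
    return (tci >> 13) & 0x7, (tci >> 12) & 0x1, tci & 0xFFF


def insert_tag(frame: bytes, tci: int, tpid: int = VET_DEFAULT) -> bytes:
    """Insert the 4-byte tag right after DA (6 bytes) and SA (6 bytes)."""
    return frame[:12] + struct.pack("!HH", tpid, tci) + frame[12:]


def strip_tag(frame: bytes, vet: int = VET_DEFAULT):
    """Model of stripping with CTRL.VME = 1b: returns (frame, tci, vp)."""
    if len(frame) < 16:
        return frame, None, False
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != vet:
        # Not an 802.1Q packet: frame is left untouched, VP stays cleared.
        return frame, None, False
    return frame[:12] + frame[16:], tci, True
```

in this sketch `strip_tag` returns the tci separately, mirroring the way the 82576 stores it in the receive descriptor rather than in the data buffer.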
7.4.4 802.1q vlan packet filtering

vlan filtering is enabled by setting the rctl.vfe bit to 1b. if enabled, hardware compares the type field of the incoming packet to a 16-bit field in the vlan ether type (vet) register. if the vlan type field in the incoming packet matches the vet register, the packet is then compared against the vlan filter table array for acceptance.

the 82576 provides exact vlan filtering for vlan tags for host traffic and vlan tags for manageability traffic.

the virtual lan id field indexes a 4096-bit vector. if the indexed bit in the vector is one, there is a virtual lan match. software might set the entire bit vector to ones if the node does not implement 802.1q filtering. the register description of the vlan filter table array is described in detail in section 8.10.19. in summary, the 4096-bit vector comprises 128 32-bit registers. the vlan identifier (vid) field consists of 12 bits. the upper 7 bits of this field are decoded to determine the 32-bit register in the vlan filter table array to address, and the lower 5 bits determine which of the 32 bits in the register to evaluate for matching.

the mc configures the 82576 with eight different manageability vids via the management vlan tag value [7:0] - mavtv[7:0] registers and enables each filter in the mfval register.

two other bits in the receive control register (see section 8.10.1), cfien and cfi, are also used in conjunction with 802.1q vlan filtering operations. cfien enables the comparison of the value of the cfi bit in the 802.1q packet to the receive control register cfi bit as acceptance criteria for the packet.

note: the vfe bit does not affect whether the vlan tag is stripped. it only affects whether the vlan packet passes the receive filter.

the following table lists reception actions per control bit settings.

note: a packet is defined as a vlan/802.1q packet if its type field matches the vet.

figure 7-19. packet reception decision table

is packet 802.1q?   ctrl.vme   rctl.vfe   action
no                  x          x          normal packet reception.
yes                 0b         0b         receive a vlan packet if it passes the standard mac address filters (only). leave the packet as received in the data buffer. vp bit in receive descriptor is cleared.
yes                 0b         1b         receive a vlan packet if it passes the standard filters and the vlan filter table. leave the packet as received in the data buffer (the vlan tag is not stripped). vp bit in receive descriptor is cleared.
yes                 1b         0b         receive a vlan packet if it passes the standard filters (only). strip off the vlan information (four bytes) from the incoming packet and store it in the descriptor. sets vp bit in receive descriptor.
yes                 1b         1b         receive a vlan packet if it passes the standard filters and the vlan filter table. strip off the vlan information (four bytes) from the incoming packet and store it in the descriptor. sets vp bit in receive descriptor.
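the vlan filter table indexing and the decision table above can be modelled with a few lines of code. this is a sketch under stated assumptions: the function names and the plain list standing in for the 128 vlan filter table registers are hypothetical, and the standard mac address filters are not modelled.

```python
VFTA_REGISTERS = 128  # 128 x 32-bit registers form the 4096-bit vector


def vfta_position(vid: int) -> tuple:
    """Upper 7 bits of the VID select the register, lower 5 bits select the bit."""
    if not 0 <= vid <= 0xFFF:
        raise ValueError("vid must fit in 12 bits")
    return vid >> 5, vid & 0x1F


def vfta_set(vfta: list, vid: int) -> None:
    """Mark a VID as accepted in the (hypothetical) register array."""
    reg, bit = vfta_position(vid)
    vfta[reg] |= 1 << bit


def vfta_match(vfta: list, vid: int) -> bool:
    """True when the indexed bit in the vector is one."""
    reg, bit = vfta_position(vid)
    return bool((vfta[reg] >> bit) & 1)


def reception_action(is_8021q: bool, vme: bool, vfe: bool) -> tuple:
    """Return (check_vlan_filter_table, strip_tag, vp_bit) per the decision table."""
    if not is_8021q:
        return False, False, False  # normal packet reception
    # VFE only affects filtering; VME only affects stripping and the VP bit.
    return vfe, vme, vme
```

for example, vid 100 maps to bit 4 of register 3, since 100 = 3 x 32 + 4.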
7.4.5 double vlan support

the 82576 supports a mode where all received and sent packets have at least one vlan tag in addition to the regular tagging, which might optionally be added. this mode is used for systems where the switches add an additional tag containing switching information. this mode is activated by setting the ctrl_ext.extended_vlan bit. the default of this bit is set according to bit 1 in word 24h/14h of the eeprom for ports 0 and 1 respectively.

the type of the vlan tag used for the additional vlan is defined in the vet.vet_ext field.

7.4.5.1 transmit behavior

it is expected that the driver includes the external vlan header as part of the transmit data structure. software may post the internal vlan header as part of the transmit data structure or embedded in the transmit descriptor (see section 7.2.2 for details). the 82576 does not relate to the external vlan header other than the capability of "skipping" it for parsing of inner fields.

note: the vlan header in a packet that carries a single vlan header is treated as the external vlan. the 82576 expects that any transmitted packet has at least the external vlan added by software. for those packets where an external vlan is not present, any offload that relates to fields inner to the ethertype may not be provided.

7.4.5.2 receive behavior

when a port of the 82576 is working in this mode, the 82576 assumes that all packets received by this port have at least one vlan, including packets received or sent on the manageability interface. one exception to this rule is flow control pause packets, which are not expected to have any vlan. other packets may contain no vlan; however, a received packet that does not contain the first vlan is forwarded to the host but filtering and offloads are not applied to this packet.

see the table below for the supported receive processing when the device is set to "double vlan" mode.

stripping of vlan is done on the second vlan if it exists. all the filtering functions of the 82576 ignore the first vlan in this mode.

the presence of a first vlan tag is indicated in the rdesc.status.vext bit.

queue assignment of the rx packets is not affected by the external vlan header. it may depend on the internal vlan, mac address or any upper layer content as described in section 7.1.1.

table 7-51. receive processing in double vlan mode

vlan headers             status.vext   status.vp   packet parsing            rx offload functions
external and internal    1             1           +                         +
internal only            not supported
v-ext                    1             0           +                         +
none (see footnote 1)    0             0           + (flow control only)     -
7.5 configurable led outputs

the 82576 implements 4 output drivers per port, intended for driving external led circuits. each lan device provides an independent set of led outputs; these pins and their function are bound to a specific lan device. each of the four led outputs can be individually configured to select the particular event, state, or activity that is indicated on that output. in addition, each led can be individually configured for output polarity as well as for blinking versus non-blinking (steady-state) indication.

the configuration for led outputs is specified via the ledctl register. furthermore, the hardware-default configuration for all the led outputs can be specified via eeprom fields, thereby supporting led displays configurable to a particular oem preference.

each of the 4 leds might be configured to use one of a variety of sources for output indication. the mode bits control the led source as described in table 7-52.

the ivrt bits allow the led source to be inverted before being output or observed by the blink-control logic. led outputs are assumed to normally be connected to the negative side (cathode) of an external led.

the blink bits control whether the led should be blinked (on for 200 ms, then off for 200 ms) while the led source is asserted. the blink control might be especially useful for ensuring that certain events, such as activity indication, cause led transitions that are sufficiently visible to the human eye.

note: when led blink mode is enabled, the appropriate led invert bit should be set to 0b. the link/activity source functions slightly differently from the others when blink is enabled. the led is off if there is no link, on if there is link and no activity, and blinking if there is link and activity.
the dynamic led modes (filter_activity, link/activity, collision, activity, paused) should be used with led blink mode enabled.

7.5.1 mode encoding for led outputs

table 7-52 lists the mode encoding used to select the desired led signal source for each led output.

table 7-52. mode encoding for led outputs

mode     selected mode      source indication
0000b    link_10/1000       asserted when either 10 or 1000 mb/s link is established and maintained.
0001b    link_100/1000      asserted when either 100 or 1000 mb/s link is established and maintained.
0010b    link_up            asserted when any speed link is established and maintained.
0011b    filter_activity    asserted when link is established and packets are being transmitted or received that passed mac filtering.
0100b    link/activity      asserted when link is established and when there is no transmit or receive activity.
0101b    link_10            asserted when a 10 mb/s link is established and maintained.
0110b    link_100           asserted when a 100 mb/s link is established and maintained.
0111b    link_1000          asserted when a 1000 mb/s link is established and maintained.
1000b    sdp_mode           led activation is a reflection of the sdp signal. sdp0, sdp1, sdp2, sdp3 are reflected to led0, led1, led2, led3 respectively.
1001b    full_duplex        asserted when the link is configured for full duplex operation (de-asserted in half-duplex).
1010b    collision          asserted when a collision is observed.
1011b    activity           asserted when link is established and packets are being transmitted or received.
1100b    bus_size           asserted when the 82576 detects a 1-lane pcie connection.
1101b    paused             asserted when the 82576's transmitter is flow controlled.
1110b    led_on             always high (asserted).
1111b    led_off            always low (de-asserted).

footnote 1 (to table 7-51): a few examples of packets that may not carry any vlan header are flow control, lacp, lldp, gmrp and 802.1x packets.

7.6 memory error correction and detection

the 82576 main internal memories are protected by error correcting code or parity code. the larger memories are protected by an error correcting code (ecc) that can detect two errors and correct one error. the smaller memories are protected either by an error correcting code (ecc) that corrects one error or by a parity bit that can detect one error.

correctable errors are silently corrected and are counted in the rpbeccsts.corr_err_cnt, tpbeccsts.corr_err_cnt, swpbeccsts.corr_err_cnt, ippbeccsts.corr_err_cnt, rdhests.corr_err_cnt, tdhests.corr_err_cnt, prbests.corr_err_cnt, pwbests.corr_err_cnt or pmsixests.corr_err_cnt fields, according to the memory in which the error was found.

some of the uncorrectable errors are counted in the rpbeccsts.uncorr_err_cnt, tpbeccsts.uncorr_err_cnt, swpbeccsts.uncorr_err_cnt, ippbeccsts.uncorr_err_cnt, rdhests.uncorr_err_cnt or tdhests.uncorr_err_cnt fields, according to the memory in which the error was found.

the 82576 reacts to uncorrectable error detection according to the location in which the error was found:
• if the error was detected in receive packet data in the main rx packet buffer, the packet is sent to the host with the rxe bit set in the receive descriptor. this packet should be discarded by the host. this is considered a non-fatal error.
• if the error was detected in transmit packet data in the main tx packet buffer, the packet is sent to the network with a wrong fcs so that the link partner can discard it. this is also considered a non-fatal error.
• if the error was detected in the descriptors attached to receive or transmit packets in the descriptor handler cache memory, or a parity error was detected in one of the internal control memories, the consistency of the receive/transmit flow cannot be guaranteed any more. in this case the traffic is stopped, an interrupt is raised, and the memory in which the error was detected is indicated in the peind register. the flow stop can be released only by software reset (ctrl.rst). this is considered a fatal error.
• if an error is detected in the ipsec rx sa table, the traffic is stopped, an interrupt is raised, and the memory in which the error was detected is indicated in the peind register. the flow stop can be released only by software reset (ctrl.rst). this is considered a fatal error.
• the interrupt causes used to indicate an error are icr[23:22], according to the severity of the error.

note: once an interrupt indicating a memory error was asserted, the peind register must be read before a new interrupt can be asserted.

the reaction mechanism of the 82576 to uncorrectable errors is enabled for each of the memories using the peindm register. parity error detection is enabled using the peindm.parity_en field. ecc error correction is enabled for each memory using the ecc enable field in the rpbeccsts, tpbeccsts, swpbeccsts, ippbeccsts, rdhests, tdhests, prbests, pwbests or pmsixests registers.

7.7 dca

7.7.1 description

direct cache access (dca) is a method to improve network i/o performance by placing some posted inbound writes directly within cpu cache. through research and experiments, dca has been shown to reduce cpu cache miss rates significantly.

dca provides a mechanism where the posted write data from an i/o device, such as an ethernet nic, can be placed into cpu cache with a hardware pre-fetch. this mechanism is initialized upon a power good reset. a device driver for the i/o device configures the i/o device for dca and sets up the appropriate cpu id and bus id for the device to send data. the device then encapsulates that information in pcie tlp headers, in the tag field, to trigger a hardware pre-fetch by the mch/ioh to the cpu cache.

dca implementation is controlled by separate registers (rxctl and txctl) for each receive and transmit queue.
in addition, a dca enable bit can be found in the dca_ctrl register, and a dca_id register can be found for each port, in order to make the function, device, and bus numbers visible to the driver.

the rxctl and txctl registers can be written by software on the fly and can be changed at any time. when software changes the register contents, hardware applies the changes only after all the previous packets in progress for dca have been completed.

however, in order to implement dca, the 82576 has to be aware of the crystal beach version used. the software driver must initialize the 82576 so that it is aware of the crystal beach version. a new register named dca_ctrl is used in order to properly define the system configuration.

there are 2 modes for dca implementation:
1. legacy dca: the dca target id is derived from the cpu id (similar to goshen).
2. dca 1.0: the dca target id is derived from the apic id.

the software driver selects one of these modes through the dca_mode register. the details of both modes are described below.
7.7.2 details of implementation

7.7.2.1 pcie message format for dca

figure 7-20 shows the format of the pcie message for dca.

figure 7-20. pcie message format for dca

the dca preferences field has the following formats.

table 7-53. legacy dca systems

bits   name             description
0      dca indication   0b: dca disabled. 1b: dca enabled.
4:1    dca target id    the dca target id specifies the target cache for the data.
7:5    reserved         reserved.

table 7-54. dca 1.0 systems

bits   name             description
7:0    dca target id    0000.0000b: dca is disabled. other: target core id derived from apic id. the method for this is described in dca platform architecture specification, section 7.3.1 (anacapa reference number 16802).

note: all functions within the 82576 have to adhere to the "tag encoding" rules for dca writes. even if a given function is not capable of dca, but other functions are capable of dca, memory writes from the non-dca function must set the tag field to "00000000".
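the two tag encodings of tables 7-53 and 7-54 can be illustrated with a small encoder. this is a minimal sketch, not the hardware implementation: the function names are hypothetical, reserved bits are assumed to be written as zero, and a disabled legacy tag is assumed to be all zeros, in line with the "00000000" rule for non-dca writes.

```python
def legacy_dca_tag(enabled: bool, target_id: int) -> int:
    """Legacy DCA: bit 0 = DCA indication, bits 4:1 = target ID, bits 7:5 reserved."""
    if not 0 <= target_id <= 0xF:
        raise ValueError("legacy DCA target id is 4 bits wide")
    if not enabled:
        return 0x00  # assumption: non-DCA writes carry an all-zero tag
    return (target_id << 1) | 0x1


def decode_legacy_dca_tag(tag: int) -> tuple:
    """Return (dca_enabled, target_id) from a legacy tag byte."""
    return bool(tag & 0x1), (tag >> 1) & 0xF


def dca10_tag(target_id: int) -> int:
    """DCA 1.0: the whole byte is the target ID; 0 means DCA is disabled."""
    if not 0 <= target_id <= 0xFF:
        raise ValueError("DCA 1.0 target id is 8 bits wide")
    return target_id


def dca10_enabled(tag: int) -> bool:
    """In DCA 1.0 any non-zero tag denotes a DCA write."""
    return tag != 0
```

how the target id itself is derived from the cpu id or apic id is outside the scope of this sketch.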
7.8 transmit rate limiting (trl)

a rate-scheduler enforces its rate limitation on a packet by packet basis, by computing the next time the entity it controls can be served, spacing the packets from each other according to the limited rate to be achieved. the output of a rate-scheduler is whether the entity can currently be served or not. this can be viewed as if an oscillating "on/off switch" controlled by the rate-scheduler was appended at the exit of each entity it controls.

rate control is defined in terms of maximum payload rate, and not in terms of maximum packet rate. this means that whenever a rate controlled packet is sent, the next time a new packet can be sent out of the same rate controlled queue is relative to the packet size of the last packet sent. the minimum spacing in time between two starts of packets sent from the same rate controlled queue is recalculated in hardware on every packet, by using the following formula:

mifs = pl x rf

where:
• pl (packet length) is the layer 2 length (without preamble and ipg) in bytes of the previous packet sent out of that rate controller. it is an integer ranging from 64 to 9k (at least 14 bits).
• rf (rate factor) = 1 gb/s / target-rate is the ratio between the nominal link rate and the target maximum rate to achieve for that rate controlled queue. it is a decimal number ranging from 1 to 1,000 (1 mb/s minimum target rate), with at least 10 bits before the radix point and 14 bits after, as required for the maximum pl by which it is multiplied.
• mifs (minimum inter frame space) is the minimum delay, in byte units, between the starts of two ethernet frames issued from the same rate controlled queue. it is an integer ranging from 76 to 9,216,012 (at least 24 bits). in spite of the 8-byte resolution provided at the internal data path, byte-level resolution is required here to maintain acceptable rate resolution (at the 1% level) for the case of small packets and high rates.

note: it might be that a pipeline implementation causes the mifs calculated on a transmitted packet to be enforced only on the subsequent transmitted packet.

note: the rate factor is defined here relative to a link speed of 1 gb/s. however, for validation purposes only, rate-schedulers may be operated over a link running at 100 mb/s. in this case, the rate factor must be configured relative to the link speed, replacing 1 gb/s by 100 mb/s in its defining formula above.

timestamps - a rate-scheduling table contains the accumulated mifs intervals, for each rate controlled descriptor queue separately, stored as an absolute timestamp (ts) relative to an internal free running timer. the ts value points to the time in the future at which the next data read request can be sent for that queue, that is, the time at which the trl switch is switched on again. each time a timestamp is updated:

timestamp(new) = timestamp(old) + mifs

when a descriptor queue starts to be rate controlled, the first mifs interval value is equal to 0 (ts equal to the current timer value), without taking into account the last packet sent prior to rate control.

when the stored ts value becomes equal to or smaller than the current free running timer value, it means that the switch is "on" and that the queue starts accumulating compensation times from the past (referred to as a negative ts). when the stored ts value is strictly greater than the current free running timer value, it means that the switch is "off" (referred to as a positive ts).

(currenttime) < timestamp <--> switch is "off"
(currenttime) >= timestamp <--> switch is "on"
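as a worked illustration of the mifs formula and the switch condition above, the following sketch computes the rate factor as a fixed-point number with 14 fractional bits and derives mifs from it. the function names are hypothetical, and truncation (rather than rounding) of the fractional part is an assumption made for this sketch; the text does not specify how hardware quantizes the result.

```python
RF_FRACTION_BITS = 14            # fractional bits of the rate factor
LINK_RATE_BPS = 1_000_000_000    # nominal link rate of 1 Gb/s


def rate_factor_fixed(target_bps: int, link_bps: int = LINK_RATE_BPS) -> int:
    """RF = link rate / target rate, as an integer scaled by 2**14."""
    if not link_bps // 1000 <= target_bps <= link_bps:
        raise ValueError("rate factor must stay within 1..1000")
    return (link_bps << RF_FRACTION_BITS) // target_bps


def mifs_bytes(packet_len: int, rf_fixed: int) -> int:
    """MIFS = PL x RF in bytes; the fraction is truncated (assumption)."""
    return (packet_len * rf_fixed) >> RF_FRACTION_BITS


def switch_is_on(current_time: int, timestamp: int) -> bool:
    """The queue may be served once the free running timer reaches the timestamp."""
    return current_time >= timestamp
```

for example, with a 100 mb/s target rate on a 1 gb/s link the rate factor is 10, so a 1500-byte packet yields a mifs of 15,000 byte times before the next packet of the same queue may start.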
mmw - the ability to accumulate negative compensation times saturates at a max memory window (mmw) time backward. the mmw size is configured per traffic class via the mmw_size field of the trlmmw register, and is expressed in 1 kb units of payload, ranging from 0 up to 2k units (at least 11 bits). the mmw_size configured in kb units of payload has to be converted into a time interval, mmw_time, expressed in kb, before a new timestamp is checked for saturation. it is computed for each queue according to its associated rate factor (rf), by using the following formula:

mmw_time = mmw_size x rf

note: mmw_time is rounded by default to a 1 kb precision level, and it must be at least 31 bits long. hence, the stored byte-level timestamp values must be at least 32 bits long to properly handle the wrap-around case, and 29 bits are required for the internal free running timer, which is clocked once every 8 bytes.

when updating a timestamp, the following condition is checked:

timestamp(old) + mifs >= (currenttime) - mmw_time

if it holds, the timestamp is updated according to the non-saturated formula:

timestamp(new) = timestamp(old) + mifs

otherwise, saturation is enforced by assigning:

timestamp(new) = (currenttime) - mmw_time + mifs

a non-null max memory window introduces some flexibility in the way controlled rates are enforced. it is required to avoid overall throughput losses and unfairness caused by rate controlled packets being over-delayed as a consequence of packets inserted in between. between two rate-limited packets spaced by at least the mifs interval, non-rate-limited packets, or rate-limited packets from other rate controlled queues, can be inserted. in the case that a rate controlled packet has been delayed by more time than was required for rate control, the next mifs accumulates from the last time the queue was "switched on" by the rate-scheduling table, and not from the current time. refer to figure 7-21 for a visualization of the effect of mmw.

caution: mmw_size set to zero must be supported as well.

figure 7-21. minimum inter frame spacing for rate controlled frames (shown in orange)
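the timestamp update with mmw saturation described above can be sketched as follows. this is an illustrative model only: the function names are hypothetical, 1 kb is assumed to mean 1024 bytes, all times are in byte units, and timer wrap-around is ignored (python integers do not overflow).

```python
KB = 1024  # assumption: 1 KB unit of MMW_SIZE equals 1024 bytes


def mmw_time_bytes(mmw_size_kb: int, rate_factor: int) -> int:
    """MMW_TIME = MMW_SIZE x RF, converted to byte units."""
    return mmw_size_kb * KB * rate_factor


def update_timestamp(ts_old: int, mifs: int, now: int, mmw_time: int) -> int:
    """Advance a queue's timestamp, saturating accumulated credit at MMW_TIME."""
    candidate = ts_old + mifs
    oldest_allowed = now - mmw_time
    if candidate >= oldest_allowed:
        # Non-saturated case: keep accumulating from the previous timestamp.
        return candidate
    # Saturated case: the queue cannot bank more than MMW_TIME of past credit.
    return oldest_allowed + mifs
```

with `mmw_time` set to zero the saturated branch returns `now + mifs`, which matches the requirement that an mmw_size of zero must be supported: a delayed queue then simply restarts its spacing from the current time.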
7.9 next generation security

7.9.1 macsec

macsec (ieee 802.1ae) is a mac level encryption/authentication scheme that uses symmetric cryptography. 802.1ae defines aes-gcm with a 128-bit key as a mandatory cipher suite, which can be processed by the lan controller. a macsec-ready switch is needed in order to complete the ecosystem and make use of macsec functionality.

the macsec implementation supports the following:
• gcm aes 128-bit offload engine in the tx and rx data path that supports gbe wire speed.
• both host and mc traffic can be processed by the gcm aes engines.
• support for a single ca (secure connectivity association):
  • single sc (secure connection) on the transmit data path.
  • single sc on the receive data path.
  • each sc supports 2 sas (security associations) for seamless re-keying.
• both the mc and the host can act as the key agreement entity (kay in 802.1ae spec terminology), that is, control and access the offloading engine (secy in 802.1ae spec terminology).
  • arbitration semaphores indicate whether the mc or the host acts as the kay.
  • tamper resistance: when the mc acts as the kay, it can disable accesses from the host to the secy's address space. when the host acts as the kay, no protection is provided.
• provides statistic counters as listed in chapter 8.0, programming interface.
• supports replay protection with a replay window equal to zero.
• receive memory structure: new macsec offload receive status indication in the receive descriptors. macsec offload must not be used with the "legacy receive" format but rather with the "extended receive descriptor" format.
• the macsec header/tag can be posted to the kay for debug.
• supports vlan header location according to ieee 802.1ae (first header inner to the macsec tag).

the 82576 does not support the end station mode of operation (es bit in the tci field of the sectag header is set) in transmit or in receive. the es bit is never set in transmit packets and is incorrectly handled if received.

throughout this document, a reference to the mc can be replaced by the me if the latter is the kay in addition to the host. the me and the mc cannot act as a kay together, and no switching mechanism between them is possible.

7.9.1.1 packet format

macsec defines the frame encapsulation format as shown below.

table 7-55. legacy frame format

mac da, sa | vlan (optional) | legacy type/len | llc data (might include ip/tcp and higher level payload) | crc
<- - - - - - - - - - - - - - - - - - - - user data - - - - - - - - - - - - - - - - - - - ->
inline functions ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 369 note: a 802.3 packet with snap encapsulation will be decrypted or authenticated by the macsec engine only if the snap header is part of the macsec user data. 7.9.1.2 macsec header (sectag) format 7.9.1.2.1 macsec ethertype the macsec ethertype comprises octet 1 and octet 2 of the sectag. it is included to allow a. coexistence of macsec capable systems in the same environment as other systems b. incremental deployment of macsec capable systems c. peer secy?s to communicate using the same media as other communicating entities d. concurrent operation of key agreement protocols that are independent of the macsec protocol and the current cipher suite e. operation of other protocols and entities that make use of the service provided by the secy?s uncontrolled port to communicate independently of the key agreement state 7.9.1.2.2 tci and an table 7-56. macsec encapsulation mac da, sa macsec header (sectag) user data (optional encrypted) macsec icv (tag) crc table 7-57. sectag format macsec ethertype tci and an sl pn sci (optional) 2 bytes 1 byte 1 byte 4 bytes 8 bytes table 7-58. macsec ethertype tag type name value 802.1ae security tag macsec ethertype 88-e5 table 7-59. tci and an description bit(s) description 7 version number (v). the lan controller support only version 0. packets with other version value are discarded by the controller. 6 end station (es). when set means that the sender is an end station thus the sci is redundant, causes the sc bit to be clear. currently should be always 0b. 5 secure channel (sc). equals 1b when the sci field is active if es bit is set sc must be cleared. currently should always be 1b. 4 single copy broadcast (scb). cleared to 0b unless the sc supports epon. should be always 0b. 3 encryption (e). set to 1b when the user data is encrypted. (see note 1 below) 2 changed text (c). 
set to 1b if the data portion is modified by the integrity algorithm. for example, if non default integrity algorithm is used or if packet is encrypted. (see note below) 1:0 association number (an). 2-bit value defined by control channel to uniquely identify sa (keys, etc.)
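The bit layout in Table 7-59 can be illustrated with a small decoder. This is a minimal sketch, not driver code: the names `TciAn`, `decode_tci_an`, `is_kay_packet`, and `tci_is_acceptable` are hypothetical and introduced only for illustration. The checks mirror what the text states (version 0 only, ES or SCB set requires SC clear, E = 1 with C = 0 reserved for KaY packets).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TciAn:
    """Decoded TCI and AN octet (octet 3 of the SecTAG), per Table 7-59."""
    version: int  # bit 7 (V); only version 0 is supported
    es: bool      # bit 6, end station
    sc: bool      # bit 5, SCI present in the SecTAG
    scb: bool     # bit 4, single copy broadcast
    e: bool       # bit 3, user data encrypted
    c: bool       # bit 2, changed text
    an: int       # bits 1:0, association number


def decode_tci_an(octet: int) -> TciAn:
    """Split the single TCI/AN octet into its fields."""
    if not 0 <= octet <= 0xFF:
        raise ValueError("TCI/AN is a single octet")
    return TciAn(
        version=(octet >> 7) & 0x1,
        es=bool(octet & 0x40),
        sc=bool(octet & 0x20),
        scb=bool(octet & 0x10),
        e=bool(octet & 0x08),
        c=bool(octet & 0x04),
        an=octet & 0x03,
    )


def is_kay_packet(tci: TciAn) -> bool:
    """E = 1 with C = 0 is reserved for KaY packets."""
    return tci.e and not tci.c


def tci_is_acceptable(tci: TciAn) -> bool:
    """Version must be 0, and ES or SCB set requires SC to be clear."""
    if tci.version != 0:
        return False
    if (tci.es or tci.scb) and tci.sc:
        return False
    return True
```

For example, the octet 0x2D decodes to SC = 1, E = 1, C = 1, AN = 1, which is the usual shape of an encrypted frame that carries an explicit SCI.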
Note: The combination of E bit equal to 1b and C bit equal to 0b is reserved for KaY packets. The MACsec logic ignores these packets on the receive path and transfers them to the KaY as is (no MACsec processing and no MACsec header strip). The 82576 never issues a packet in which the E bit is clear and the C bit is set, although it tolerates such packets on receive. See Section 7.9.1.4 for details of the handling of received packets with the C bit set.

7.9.1.2.3 Short Length

Table 7-60. Short Length (SL) Field Description

| Bit(s) | Description |
|--------|-------------|
| 7:6    | Reserved 0b. |
| 5:0    | Short length (SL). Number of octets in the secure data field, from the end of the SecTAG to the beginning of the ICV, if it is less than 48 octets; otherwise the SL value is 0b. |

7.9.1.2.4 Packet Number (PN)

The MACsec engine increments the PN for each packet on the transmit side. The PN is used to generate the initial value (IV) for the crypto engines. When the KaY establishes a new SA, it should set the initial value of the PN to one. See Section 7.9.1.5.1 for more details on PN exhaustion.

7.9.1.2.5 Secure Channel Identifier (SCI)

The SCI is composed of the MAC address and port number as shown in the table below. If the SC bit in the TCI is not set, the SCI is not encoded in the SecTAG.

Table 7-61. SCI Field Description

| Bytes 0-5          | Bytes 6-7   |
|--------------------|-------------|
| Source MAC address | Port number |

7.9.1.2.6 Initial Value (IV) Calculation

The IV is the initial value used by the Tx and Rx authentication engines. The IV is generated from the PN and SCI as described in the 802.1AE specification.

7.9.1.3 MACsec Management - KaY (Key Agreement Entity)

The KaY management is done by the host or the BMC. See Chapter 10.0 for details on the transfer of ownership between these two entities. The ownership of the MACsec management is as follows:

1. Initialization at power up or after wake on LAN
   - In most cases the MC wakes before the host, thus:
     - If the MC is capable of being a KaY, it establishes an SC (authentication and key exchange).
     - If the MC is not capable of being a KaY, the only way for it to communicate is through a VLAN. This means that the switch must support settings that allow a specific VLAN to bypass MACsec.
   - When the host is awake:
     - If the MC acted as the KaY, the host should authenticate itself and transfer its ability to authenticate to the MC, so that the MC transfers ownership of the MACsec hardware. At this stage the system works in proxy mode, where the host manages the secured channel while the MC piggybacks on it.
     - If the MC was not the KaY, the host takes ownership of the MACsec hardware and establishes an SC (authentication and key exchange). The MC remains on a separate VLAN and all host traffic should carry a VLAN tag.
2. Host at Sx state, MC active
   - If the MC is not KaY capable, the SC should be reset by a link reset or by sending a logoff packet (1AF), and the MC can return to the VLAN solution (or remain in it).
   - If the MC is KaY capable, the host should notify the MC that it is retiring KaY ownership and the MC should retake it. Alternatively, the MC should identify cases where communication is broken due to lack of KaY maintenance by the host and retake ownership.
3. Host and MC at Sx
   - The active KaY should reset the secured channel by a link reset or by sending a logoff packet (1AF) in order to allow WoL packets in the clear.

7.9.1.4 Receive Flow

The 82576 might concurrently receive packets that contain MACsec encapsulation and packets that do not. This section describes the incoming packet classification.

Note: This flow assumes the Rx mode is set to strict.

- Examine the user data for a SecTAG.
  - If there is no SecTAG, process the packet with the SECP bit cleared in the descriptor.
- Validate frames with a SecTAG by checking that:
  - The MPDU comprises at least 17 octets.
  - Octets 1 and 2 compose the MACsec Ethertype (0x88E5).
  - The V bit in the TCI is clear.
  - If the ES or the SCB bit in the TCI is set, then the SC bit is cleared.
  - Bits 7 and 8 of octet 4 of the SecTAG are clear (SL <= 0x3F).
  - If the C and SC bits in the TCI are clear, the MPDU comprises 24 octets plus the number of octets indicated by the SL field if that is non-zero, and at least 72 octets otherwise.
  - If the C bit is clear and the SC bit is set, the MPDU comprises 32 octets plus the number of octets indicated by the SL field if that is non-zero, and at least 80 octets otherwise.
  - If the C bit is set and the SC bit is clear, the MPDU comprises 8 octets plus the minimum length of the ICV as determined by the cipher suite in use at the receiving SecY, plus the number of octets indicated by the SL field if that is non-zero, and at least 48 additional octets otherwise.
  - If the C and SC bits are both set, the frame comprises at least 16 octets plus the minimum length of the ICV as determined by the cipher suite in use at the receiving SecY, plus the number of octets indicated by the SL field if that is non-zero, and at least 48 additional octets otherwise.
- Extract and decode the SecTAG as specified in Section 7.9.1.2.
- Extract the user data and ICV as specified in Section 7.9.1.1.
- Assign the frame to an SA:
  - If the SCI is valid, use it to identify the SC.
    - Select the SA according to the AN value.
    - If no valid SC or no valid SA is found, drop the packet.
  - If the SCI is omitted, use the default SC.
    - Select the SA according to the AN value.
    - If no valid SC (or more than one SC active) or no valid SA is found, drop the packet.
- Perform a preliminary replay check against the last validated PN.
- Provide the validation function with:
  - The SA key (SAK)
  - The SCI for the SC used by the SecY to transmit
  - The PN
  - The SecTAG
  - The sequence of octets that compose the secure data
  - The ICV
- Receive the following parameters from the cipher suite validation operation:
  - A valid indication, if the integrity check was valid and the user data could be recovered
  - The sequence of octets that compose the user data
- Update the replay check.
- Issue an indication to the controlled port with the DA, SA, and priority of the frame as received from the receive de-multiplexer, and the user data provided by the validation operation.

Note: All references to clauses are to the IEEE P802.1AE/D5.1 document from January 19, 2006.

7.9.1.4.1 MACsec Receive Modes

There are four modes of operation for MACsec Rx, as defined by the LSECRXCTRL.LSRXEN field:

1. Bypass (LSRXEN = 00b) - In this mode, MACsec is not off-loaded. There is no authentication or decryption of the incoming traffic. The MACsec header and trailer are not removed, and these packets are forwarded to the host or the MC according to the regular L2 MAC filtering. The packet is considered untagged (no VLAN filtering). No further offloads are done on MACsec packets.
2. Check (LSRXEN = 01b) - In this mode, incoming packets with a matching key are decrypted and authenticated according to the MACsec tag. The MACsec header and trailer might be removed from these packets, and the packets are forwarded to the host or the MC according to the regular L2 filtering.
   Additional offloads are possible on MACsec packets, assuming the packet was decrypted. The header is not removed from KaY packets. In this mode the hardware applies a less strict policy than strict mode when deciding whether to forward or drop packets. Because this mode is mainly for debug purposes, or to overcome first generation standard inconsistencies, most packets are still forwarded to higher layers with a suitable error code. The only case where packets are dropped is when the C bit is set and the packet failed authentication. When the hardware fails to locate a key but still forwards the packet, the SecTAG is not removed if bit 6 of LSECRXCTRL is set, while the ICV is not included in the packet.
3. Strict (LSRXEN = 10b) - In this mode, incoming packets with a matching key are decrypted and authenticated according to the MACsec tag. The MACsec header and trailer might be removed from these packets, and the packets are forwarded to the host only if the decryption or authentication was successful. Additional offloads are possible on MACsec packets. The header is not removed from KaY packets.
   Note: Setting RCTL.SBP (store bad packets) might override this mode, as all packets are then forwarded to the host regardless of the MACsec offload success.
4. Disable (LSRXEN = 11b) - In this mode, MACsec is not offloaded and MACsec packets are dropped. There is no authentication or decryption of the incoming traffic.

7.9.1.4.2 Receive SA Exhausting - Re-keying

The seamless re-keying mechanism is explained in the following example. The KaY establishes an SC and sets SA0 as the active SA by writing the key in the MACsec RX Key register, writing the AN in LSECRXSA[0], and setting the SA Valid bit in the same register; this clears the Frame Received bit. When the first packet arrives on SA0, the Frame Received bit is set automatically. Only then can, and should, the KaY initiate SA1 in the same manner as for SA0. When a frame of SA1 arrives, SA0 retires and can be used for the next SA.

7.9.1.4.3 Receive SA Context and Identification

Upon arrival of a secured frame, the context of the SecTAG is verified. This context is described in Section 7.9.1.2. In order to process the secured frame, it must be associated with one of the SA keys. The identification is done by comparing the SCI data with the MACsec RX SC registers to ensure that the frame belongs to the SC. The AN field of the incoming frame is compared to the AN field of the Link RX SA register of the SC in order to select an SA. The PN of the selected SA (register MACsec RX SA PN) is then compared to the incoming PN, which must be equal to or greater than the MACsec RX SA PN value; otherwise the frame is dropped. On a match, the selected SA key is used for the secured frame processing.

7.9.1.4.4 Receive Statistic Counters

A detailed list and description of the MACsec Rx statistics counters can be found in Section 8.0, Programming Interface.
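The SA identification and preliminary replay check of Section 7.9.1.4.3 can be sketched in software as follows. This is an illustrative model only: `RxSa` and `select_rx_sa` are hypothetical names, the register contents are modeled as plain Python values, and a single SC with a small list of SAs is assumed.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class RxSa:
    an: int       # association number programmed for this SA
    valid: bool   # models the SA valid bit
    min_pn: int   # models the value held in the MACsec RX SA PN register


def select_rx_sa(sc_sci: bytes, sas: Sequence[RxSa],
                 frame_sci: Optional[bytes], frame_an: int,
                 frame_pn: int) -> Optional[int]:
    """Return the index of the SA to use, or None if the frame is dropped.

    frame_sci is None when the SecTAG omits the SCI; the default SC is
    then assumed.
    """
    # The SCI, when present, must match the configured SC.
    if frame_sci is not None and frame_sci != sc_sci:
        return None
    for index, sa in enumerate(sas):
        if sa.valid and sa.an == frame_an:
            # Incoming PN must be equal to or greater than the stored PN.
            return index if frame_pn >= sa.min_pn else None
    # No valid SA carries this AN.
    return None
```

A frame whose PN is below the stored value is rejected before any cryptographic work is done, which is why the text calls this check preliminary; the stored PN is only updated after the ICV has been validated.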
7.9.1.5 Transmit Flow

The 82576 might concurrently transmit packets that contain MACsec encapsulation and packets that do not. This section describes the transmit packet classification, transmit descriptors, and statistic counters.

Note: Since flow control (pause) packets are part of the MAC service, they should not go through the MACsec logic.

1. Assign the frame to an SA by adding the AN according to the SA Select bit in the LSECTXSA register.
2. Assign the nextPN variable for that SA, to be used as the value of the PN in the SecTAG, based on the value in the appropriate (according to SA) LSECTXPN register.
3. Encode the octets of the SecTAG according to the setting in the LSECTXCTRL register.
4. Provide the protection function of the current cipher suite with:
   a. The SA key (SAK).
   b. The SCI for the SC used by the SecY to transmit.
   c. The PN.
   d. The SecTAG.
   e. The sequence of octets that compose the user data.
5. Receive the following parameters from the cipher suite protection operation:
   a. The sequence of octets that compose the secure data.
   b. The ICV.
6. Issue a request to the transmit multiplexer with the destination and source MAC addresses, and priority of the frame as received from the controlled port, and an MPDU comprising the octets of the SecTAG, secure data, and the ICV concatenated in that order.

7.9.1.5.1 Transmit SA Exhausting - Re-keying

The 82576 supports a single SC on the transmit data path with a seamless re-keying mechanism. The SC might act with one of two optional SAs. The SA is selected statically by the Active SA field in the LSECTXSA register. Once the KaY entity (either software or firmware, as defined by the MACsec Ownership field in the FWSM register) changes the setting of the SA Select field in the LSECTXSA register, the Active SA field takes the same value on a packet boundary. The next packet processed by the transmit MACsec engine uses the updated SA.

The KaY should switch between the two SAs before the packet number (PN) is exhausted. To protect against such an event, the hardware generates a "MACsec Packet Number" interrupt to the KaY when the PN reaches the exhaustion threshold defined in the LSECTXCTRL register. The exhaustion threshold should be set to a level that enables the KaY to switch between SAs faster than the PN can be exhausted. If the KaY is slower than it should be, the PN might increase beyond the planned value. The hardware guarantees that the PN never repeats itself, even if the KaY is "slow". Once the PN reaches a value of 0xFF...FFEF, the hardware clears the Enable Tx MACsec field in the LSECTXCTRL register to 00b. By clearing the Enable Tx MACsec field, the hardware disables MACsec off-load before the PN can wrap around and repeat itself.
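The PN exhaustion behavior described above can be modeled with a small state machine. This is a sketch under stated assumptions: `TxPnTracker` and its attributes are hypothetical names, the interrupt is modeled as a flag, and both the exhaustion threshold and the disable point are passed in as parameters rather than hard-coded, because the text gives the disable point only in abbreviated form.

```python
class TxPnTracker:
    """Software model of per-SA packet number handling on transmit."""

    def __init__(self, exhaustion_threshold: int, disable_at: int,
                 first_pn: int = 1) -> None:
        self.next_pn = first_pn            # a new SA starts at PN = 1
        self.exhaustion_threshold = exhaustion_threshold
        self.disable_at = disable_at
        self.enabled = True                # models the Enable Tx MACsec field
        self.interrupt_pending = False     # models the PN interrupt to the KaY

    def take_pn(self) -> int:
        """Return the PN for the next packet and advance the counter."""
        if not self.enabled:
            raise RuntimeError("Tx MACsec off-load is disabled")
        pn = self.next_pn
        self.next_pn += 1
        if self.next_pn >= self.exhaustion_threshold:
            # Ask the KaY to switch to the other SA.
            self.interrupt_pending = True
        if self.next_pn >= self.disable_at:
            # Stop off-load before the PN can wrap and repeat.
            self.enabled = False
        return pn

    def switch_sa(self, first_pn: int = 1) -> None:
        """Model the KaY selecting a fresh SA and re-enabling off-load."""
        self.next_pn = first_pn
        self.interrupt_pending = False
        self.enabled = True
```

The two thresholds serve different purposes: the first gives the KaY time to re-key, and the second is a hard stop that trades connectivity for the guarantee that a PN is never reused under the same key.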
Note: A potential race condition exists, as follows. The LAN controller might fetch a transmit packet (referred to as TxPacketN) from host memory (a host or MC packet). The KaY can then change the setting of the Tx SA index. TxPacketN might use the new Tx SA index if the index was updated before TxPacketN propagated to the transmit MACsec engine. This race is not critical, since the receiving node should be able to process the previous SA as well as the new SA during the re-keying transition period.

7.9.1.5.2 Transmit SA Context

Upon transmission of a secured frame, the SA associated data is inserted into the SecTAG field of the frame. The SecTAG data is composed from the MACsec Tx registers. The SCI value is taken from the MACsec TX SCI Low and High registers unless the hardware is instructed to omit the SCI. The AN value is taken from the active MACsec Tx SA and the PN from the appropriate MACsec TX SA PN register.

7.9.1.5.3 Transmit Statistic Counters

A detailed list and description of the MACsec Tx statistics counters can be found in Section 8.0, Programming Interface.
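The SecTAG assembly described in Section 7.9.1.5.2 follows the field order of Table 7-57. A minimal sketch of that encoding is shown below; `build_sectag` and its parameters are hypothetical names used for illustration, and the multi-byte fields are packed in network byte order.

```python
import struct
from typing import Optional

MACSEC_ETHERTYPE = 0x88E5


def build_sectag(an: int, pn: int, secure_data_len: int,
                 sci: Optional[bytes] = None,
                 encrypted: bool = False, changed: bool = False) -> bytes:
    """Encode Ethertype, TCI/AN, SL, PN and the optional SCI."""
    if not 0 <= an <= 3:
        raise ValueError("AN is a 2-bit value")
    if not 1 <= pn <= 0xFFFFFFFF:
        raise ValueError("PN is a non-zero 32-bit value")
    if sci is not None and len(sci) != 8:
        raise ValueError("SCI is 8 bytes (MAC address plus port number)")

    tci = 0
    if sci is not None:
        tci |= 0x20   # SC bit: SCI is encoded in the SecTAG
    if encrypted:
        tci |= 0x08   # E bit
    if changed:
        tci |= 0x04   # C bit
    # SL carries the secure data length only when it is below 48 octets.
    sl = secure_data_len if secure_data_len < 48 else 0

    tag = struct.pack("!HBBI", MACSEC_ETHERTYPE, tci | an, sl, pn)
    return tag + (sci if sci is not None else b"")
```

With an SCI the tag is 16 bytes long and without it 8 bytes, which matches the 8-octet difference between the SC-set and SC-clear cases in the receive length checks of Section 7.9.1.4.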
7.9.1.6 Manageability Engine/Host Relations

The LAN controller supports a single CA for all the traffic that it handles. At any given time, the host might be active or inactive, as might the BMC. It is expected that when only the MC is enabled, it acts as the KaY controlling the secured channel. The host can act as the KaY when it is functional and the control switch has been executed. The following sections describe the semaphore between the MC and the host that controls the MACsec settings, and its tamper resistance (protection) mechanism.

7.9.1.6.1 Key and Tamper Protection

MACsec gives the network administrator a means of protecting the network infrastructure from hostile or unauthorized devices. Since the local host operating system might itself be compromised, the hardware protects vital MACsec context from software access. There are two levels of protection:
- Disable host read access to the MACsec keys (keys are write-only).
- Disable host access to the MACsec logic while the firmware manages the MACsec secure channel (SC).

7.9.1.6.2 Key Protection

The MACsec keys are protected against read accesses at all times. Neither software nor firmware can read back the keys that the hardware uses for transmit and receive activity. Instead, the hardware lets software and firmware read a signature that makes it possible to verify proper programming of the device. The signature is a simple byte XOR operation of the Tx and Rx keys, readable in the LSECTXSUM and LSECRXSUM fields in the LSECTXCAP and LSECRXCAP registers.

7.9.1.6.3 Tamper Protection

In a scenario where the host failed authentication and therefore cannot act as the KaY, the MC disables host access to the network and manages the MACsec channel while the host operating system is already up and running. In such cases, the hardware provides the required hooks to protect MACsec connectivity against hostile software. The MC firmware can disable write accesses generated by the host CPU (on the PCI interface) by setting the Lock MACsec Logic bit (bit 0) in the LSWFW register.

7.9.1.6.4 MACsec Control Switch Between Firmware and Software

The stages for switching MACsec control ownership between the MC and the host are described in Chapter 10.0. After the switch procedure, the new owner must assume all KaY responsibilities.

7.9.1.7 Manageability Flow

7.9.1.7.1 Initialization

In the manageability case, the main difference in initialization is that in some cases the host is off. In such cases the BMC should perform the authentication, MACsec, and SGT acquisition on its own. When the host is on, it is the host's responsibility to acquire the SGT values for the BMC. It is assumed that the BMC uses only one SGT on the Tx side, so no table is needed, only one SGT register. On the Rx side the table holds one vector for the BMC, in the same manner as an additional queue. When the host is off, it is the BMC's responsibility to initialize the hardware tables for the host entry as well (disable traffic in both directions).
7.9.1.7.2 Operation Flow

It is assumed that the manageability traffic is assigned only one SGT, so the SGT value that the hardware adds to the CMD tag is stored in the CTSTXCTL register. On the receive side the CTSRXMNGT table is used for filtering traffic.

7.9.1.8 Switching Ownership Between Host and Manageability

Since it is assumed that CTS is never activated without MACsec, CTS ownership is tightly coupled with MACsec ownership. In other words, the entity that owns the MACsec logic also owns the CTS tagging.

7.9.2 IPsec Support

Note: This section defines the hardware requirements for the IPsec off-load ability included in the 82576. IPsec off-load is the ability to handle a certain number of the total IPsec flows in hardware, while the remaining flows are still handled by the operating system. It is the operating system's responsibility to submit the most loaded flows to hardware in order to take maximum benefit of the IPsec off-load in terms of CPU utilization savings. The establishment of the IPsec security associations between peers is outside the scope of this document, since it is always handled by the operating system. In general, the requirements on the driver or on the operating system for enabling IPsec off-load are not detailed here.

When an IPsec flow is handled in software, the packet might be encrypted and the integrity check field already valid (IPv4 options might be present in the packet together with IPsec headers). The 82576 therefore processes it as it does any other unsupported layer4 protocol, without performing any layer4 offload on it.

7.9.2.1 Related RFCs and Other References

- RFC4106 - The Use of Galois/Counter Mode (GCM) in IPsec Encapsulating Security Payload (ESP)
- RFC4302 - IP Authentication Header (AH)
- RFC4303 - IP Encapsulating Security Payload (ESP)
- RFC4543 - The Use of Galois Message Authentication Code (GMAC) in IPsec ESP and AH
- GCM spec - McGrew, D. and J. Viega, "The Galois/Counter Mode of Operation (GCM)", submission to NIST. http://csrc.nist.gov/cryptotoolkit/modes/proposedmodes/gcm/gcm-spec.pdf, January 2004.

7.9.2.2 Hardware Features List

7.9.2.2.1 Main Features

- Off-load IPsec for up to 256 security associations (SA) for each side separately, Tx and Rx.
  - On-chip storage for both Tx and Rx SA tables
  - Tx SA index is conveyed to hardware via the Tx context descriptor
  - Rx SA lookup is a deterministic search according to a search key made of SPI, destination IP address, and IP version type (IPv6 or IPv4)
- Performance in Rx: update the whole Rx SA table in less than 1 ms, while receiving back-to-back 64-byte packets
- IPsec protocols:
  - IP authentication header (AH) protocol for authentication
  - IP encapsulating security payload (ESP) for authentication only
  - IP encapsulating security payload (ESP) for both authentication and encryption, only if using the same key for both
- Crypto engines:
  - For AH or ESP authentication only, use AES-128-GMAC (128-bit key)
  - For ESP encryption and authentication, use AES-128-GCM (128-bit key)
- IPsec encapsulation mode: transport mode
  - In Tx, packets are provided by software already encapsulated with a valid IPsec header (for AH, with a blank ICV inside), and
    - for ESP single send, with a valid ESP trailer and ESP ICV (blank ICV)
    - for ESP large send, without ESP trailer and without ESP ICV
  - In Rx, packets are provided to software encapsulated with their IPsec header and, for ESP, with the ESP trailer and ESP ICV,
    - where up to 255 bytes of incoming ESP padding is supported, for peers that prefer hiding the packet length
- IP versions:
  - IPv4 packets that do not include any IP option
  - IPv6 packets that do not include any extension header (other than the AH/ESP extension header)
- Rx statuses reported to software via the Rx descriptor:
  - Packet type: AH/ESP
  - IPsec off-load done (SA match)
  - One Rx error reported to software via the Rx descriptor, in the following precedence order: no error, invalid IPsec protocol, packet length error, authentication failed

7.9.2.2.2 Cross Features

- w/ segmentation: full coexistence (TCP/UDP packets only)
  - Increment the IPsec sequence number (SN) and initialization vector (IV) on each additional segment
- w/ checksum off-load: full coexistence (Tx and Rx)
  - IP header checksum
  - TCP/UDP checksum
- w/ IP fragment: no IPsec offload done on IP fragments
- w/ RSS: full coexistence, hash on the same fields used without IPsec (either 4-tuples or 2-tuples)
- w/ MACsec off-load:
  - A device interface is operated in either MACsec off-load or IPsec off-load mode, but not both together
  - If both IPsec and MACsec encapsulations are required on the same packets, the device interface is operated in MACsec off-load mode, while IPsec is performed by the operating system
- w/ virtualization:
  - Full coexistence in next generation VMDq mode
  - In IOV mode, all IPsec registers are owned by the VMM/PF. IPsec can be used for VMotion traffic (for instance).
  - No coexistence with VM to VM switch. IPsec packets handled in hardware are not looped back by the 82576 to another VM. Tx IPsec packets destined to a local VM must be handled in software and looped back via the software switch. The anti-spoofing check is nevertheless performed on any IPsec packet.
- w/ 9500 byte jumbo frames: full coexistence
- w/ 802.1x: no interaction
- w/ teaming: no interaction
- w/ TimeSync:
  - TimeSync 1588v1 UDP packets must not be encapsulated as IPsec packets
  - No interaction with TimeSync 1588v2 layer2 packets
- w/ layer2 encapsulation modes:
  - IPsec offload is not supported for flows with a SNAP header
  - IPsec offload coexists with double-VLAN encapsulations
- Tunneled IPsec packets in receive: IPsec offload supported, but no other layer4 offload performed
- w/ NFS: NFS packets encapsulated over ESP packets (whether offloaded or not) are not recognized
- w/ SCTP offload: no SCTP CRC32 offload performed on received ESP packets (even those handled by hardware), but SCTP offload performed on any IPsec packet.
- w/ manageability traffic: IPsec offload ability is controlled exclusively by the host and, because of an implementation limitation, no IPsec offload is possible on Tx manageability packets. For IPsec flows handled by software:
  - If manageability and host entities share some IP address(es), manageability should coordinate any use of the IPsec protocol with the host. Note that this is also true for previous devices that do not offer IPsec offload.
  - If manageability and host entities have totally separate IP addresses, manageability can use the IPsec protocol (as long as it is handled by the MC software).
- w/ header split/replication:
  - For SAs handled in hardware, the IP boundary split is done before the IPsec header
  - For SAs handled in software, no header split/replication is done
- w/ 5-tuple Rx filters:
  - For ESP packets, only the TCP, UDP, and SCTP protocols are recognized in the FTQF registers

7.9.2.3 Software/Hardware Demarcation

The following items are not supported by the hardware but might be supported by the operating system/driver:
- Multicast SAs
- IPsec protocols:
  - Both AH and ESP protocols on the same SA or packet
  - ESP for encryption only
  - ESP for both authentication and encryption using different keys and/or different crypto engines
- Crypto engines:
  - AES-256, SHA-1, AES-128-CBC, or any other crypto algorithm
- Tx IPsec packets encapsulated in tunnel mode
- Extended sequence number (ESN)
- IP versions:
  - IPv4 packets that include an IP option
  - IPv6 packets that include extension headers other than the AH/ESP extension headers
- Anti-replay check and discard of incoming replayed packets
- Discard of incoming "dummy" ESP packets (packets with protocol value 59)
- IPsec packets that are IP fragments
- ESP padding content check
- IPsec statistics
- IPsec for flows with a SNAP header

Note: For SCTP and other layer4 header types, or for tunneled packets, the hardware should not care what is present when doing Rx IPsec processing. Everything after the IP/IPsec headers can be treated as opaque IP payload. It is fine to do IPsec processing on any packet that has a matching SA and appropriate IP options/extension headers. The hardware is not expected to determine what is in the packet beyond the IP/IPsec headers before decrypting/authenticating the packet. The most important point is that the hardware should not corrupt or drop incoming IPsec packets in any situation. Once the hardware starts performing IPsec offload on a packet, it should continue the offload until the end of the packet, at the price of possibly not doing other layer3/4 offloads on it. It is always legitimate for the hardware not to start the IPsec offload on a matched SA if it knows the encapsulation is unsupported, that is, one of three cases: IPv4 option, IPv6 extensions, or SNAP.
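The three unsupported encapsulations named in the note above amount to a simple eligibility test that a driver could apply before submitting a flow for offload. The sketch below is illustrative only: `ipsec_offload_supported` and its parameters are hypothetical, and the caller is assumed to have already parsed the IPv4 header length (IHL, in 32-bit words) or the chain of IPv6 extension header types.

```python
from typing import Sequence

# IANA protocol numbers for the two IPsec headers.
PROTO_ESP = 50
PROTO_AH = 51


def ipsec_offload_supported(ip_version: int, *, ihl: int = 5,
                            ext_headers: Sequence[int] = (),
                            snap: bool = False) -> bool:
    """Return True if the encapsulation is one the hardware can offload."""
    if snap:
        # Flows with a SNAP header are never offloaded.
        return False
    if ip_version == 4:
        # An IHL above 5 words means IPv4 options are present.
        return ihl == 5
    if ip_version == 6:
        # Only the AH/ESP extension headers themselves are allowed.
        return all(h in (PROTO_AH, PROTO_ESP) for h in ext_headers)
    return False
```

A flow that fails this test would still be handled, just by the operating system's IPsec stack rather than by the hardware.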
7.9.2.4 IPsec Formats Exchanged Between Hardware and Software

This section deals with the IPsec packet encapsulation formats used between software and hardware for IPsec packets concerned by the off-load, in either the Tx or the Rx direction.

In the Rx direction, IPsec packets are delivered by hardware to software encapsulated as they were received from the line, whether or not IPsec off-load was done and, when it was done, whether authentication/decryption succeeded or failed.

7.9.2.4.1 Single Send

In the Tx direction, single send IPsec packets are delivered by software to hardware already encapsulated and formatted with their valid IPsec header and trailer contents, as they should appear on the wire. The exceptions are the ICV field, which is filled with zeros, and the ESP payload destined to be encrypted, which is provided in clear text before any encryption.

7.9.2.4.2 Single Send With TCP/UDP Checksum Offload

For single send ESP packets with TCP/UDP checksum off-load, the checksum computation must include the TCP/UDP header and payload before hardware encryption occurs, and must exclude the ESP trailer and ESP ICV provided by software. Software provides the length of the ESP trailer plus ESP ICV in a dedicated field of the Tx context descriptor (IPS_ESP_LEN field) so that hardware knows when to stop the TCP/UDP checksum computation.

Software calculates a full checksum for the IP pseudo-header as in the usual case. The protocol value used in the IP pseudo-header must be the TCP/UDP protocol value and not the AH/ESP protocol value that appears in the IP header. This full checksum of the pseudo-header is placed in the packet data buffer at the appropriate offset for the checksum calculation.

The byte offset from the start of the DMA'd data to the first byte to be included in the TCP/UDP checksum (for example, the start of the TCP header) is computed as in the usual case: MACLEN + IPLEN. This assumes that the IPLEN provided by software in the Tx context descriptor is the sum of the IP header length and the IPsec header length.

Note: For the IPv4 header checksum off-load, hardware can no longer rely on the IPLEN field provided by software in the Tx context descriptor, but relies instead on the fact that no IPv4 option is present in the packet. Consequently, for IPsec off-load packets, hardware always computes the IP header checksum over a fixed length of 20 bytes.

7.9.2.4.3 Large Send TCP/UDP

In the Tx direction, large send IPsec packets are delivered by software to hardware already encapsulated and formatted with only their valid IPsec header contents. The exceptions are the ICV field included in AH packet headers, which is filled with zeros, and the ESP payload destined to be encrypted, which is provided in clear text before any encryption. No ESP trailer or ESP ICV is appended to the large send by software. This means that hardware has to append the ESP trailer and ESP ICV to each segment by itself, and update the IP total length / IP payload length accordingly. The next header of the ESP trailer to be appended by hardware is taken from the TUCMD.L4T field of the Tx context descriptor.

By definition, large send segmentation requires that on each segment the IP total length / IP payload length be updated, and the IP header checksum and TCP/UDP checksum be re-computed. For the large send of IPsec packets, the SN and IV fields must also be increased by one in hardware on each new segment (after the first one).

Caution: The driver should not offload a large send that would cause the SN and/or IV field to wrap around in hardware.

Software calculates a partial checksum for the IP pseudo-header as in the usual case. The protocol value used in the IP pseudo-header must be the TCP/UDP protocol value and not the AH/ESP protocol value that appears in the IP header. This partial checksum of the pseudo-header is placed in the packet data buffer at the appropriate offset for the checksum calculation.

The byte offset from the start of the DMA'd data to the first byte to be included in the TCP/UDP checksum (for example, the start of the TCP/UDP header) is computed as in the usual case: MACLEN + IPLEN. This assumes that the IPLEN provided by software in the Tx context descriptor is the sum of the IP header length and the IPsec header length.

For large send ESP packets, the TCP/UDP checksum computation must include the TCP/UDP header and payload before hardware encryption occurs, and must exclude the ESP trailer and ESP ICV appended by hardware. Hardware must stop the TCP/UDP checksum computation after the number of bytes given by
L4LEN + MSS. It is assumed that the MSS value placed by software in the Tx context descriptor specifies the maximum TCP/UDP payload segment sent per frame, not including any IPsec header or trailer, and not including the TCP/UDP header.

Note: For IPv4 header checksum computing, refer to the note in Section 7.9.2.4.2.

Shaded fields in the figures below correspond to fields that need to be updated per each segment.

Figure 7-22. IPv4 Large Send ESP Packet Provided by Software
(each row is one 32-bit word, bits 0 to 31)

| Header     | Word | Fields                                  |
|------------|------|-----------------------------------------|
| IPv4       | 1    | Ver, HLEN, TOS, IP total length         |
| IPv4       | 2    | Identification, Flags, Fragment offset  |
| IPv4       | 3    | TTL, Protocol = ESP, Header checksum    |
| IPv4       | 4    | Source IPv4 address                     |
| IPv4       | 5    | Destination IPv4 address                |
| ESP        | 1    | Security parameter index (SPI)          |
| ESP        | 2    | Sequence number (SN)                    |
| ESP        | 3-4  | Initialization vector (IV)              |
| TCP/UDP    | 1... | TCP/UDP header, then TCP/UDP payload    |

Figure 7-23. IPv6 Large Send ESP Packet Provided by Software
(each row is one 32-bit word, bits 0 to 31)

| Header     | Word | Fields                                  |
|------------|------|-----------------------------------------|
| IPv6       | 1    | Ver, Priority, Flow label               |
| IPv6       | 2    | IP payload length, Next hdr = ESP, Hop limit |
| IPv6       | 3-6  | Source IPv6 address                     |
| IPv6       | 7-10 | Destination IPv6 address                |
| ESP        | 1    | Security parameter index (SPI)          |
| ESP        | 2    | Sequence number (SN)                    |
| ESP        | 3-4  | Initialization vector (IV)              |
| TCP/UDP    | 1... | TCP/UDP header, then TCP/UDP payload    |

7.9.2.5 Tx SA Table

IPsec off-load is supported only via advanced transmit descriptors. See Section 7.2.2 for details.

7.9.2.5.1 Tx SA Table Structure

The Tx SA table contains the extra information required by the AES-128 crypto engine to authenticate and encrypt the data. This information is not sent over the wire together with the IPsec packets; rather, it is exchanged between the IPsec peers' operating systems during the security association establishment process. When the IKE software does a key computation, it computes 4 extra bytes using a pseudo-random function. That is, it generates 20 bytes instead of the 16 bytes that it needs to use as a key, and the last 4 bytes are used as the salt value.
the sa table in tx is a 256 x 20-byte table loaded by software. each line in the table contains the following fields:

table 7-62. tx sa table
field | size
aes-128 key | 16 bytes
aes-128 salt | 4 bytes

refer to section 7.9.2.7 for a description of the way these fields are used by the aes-128 crypto engine.

whenever an unrecoverable memory error occurs when accessing the tx sa table, an nfer interrupt is generated in the icr register, the index of the corrupted sa entry is reported via the peind register, and the transmit path is stopped until the host resets the device. packets that have already started to be transmitted on the wire are sent with a wrong crc.

upon reset, the 82576 clears the contents of the tx sa table.

7.9.2.5.2 access to tx sa table
1. software writes the ipstxidx register.
2. software writes ipstxkey 0..3 first and writes ipstxsalt last, because the ipstxsalt write is used internally to trigger the write of the whole entry into the tx sa table. read accesses to these registers can be done in any order.
3. hardware issues a write/read command to/from the sa table, copying/reading the ipstxkey (16 bytes) and the ipstxsalt (4 bytes) to/from the sa table using the index in ipstxidx.
4. software starts/resumes sending ipsec off-load packets with sa_idx pointing to valid/invalid sa entries. a valid sa entry contains updated key and salt fields currently in use by the ipsec peers.

7.9.2.6 tx hardware flow

7.9.2.6.1 single send without tcp/udp checksum offload:
1. extract the ipsec off-load request from the ipsec bit of the popts field in the advanced transmit data descriptor.
2. if ipsec off-load is required for the packet (ipsec bit was set), then extract the sa_idx, encryption, and ipsec_type fields from the tx context descriptor associated with that flow.
3. fetch the aes-128 key and salt from the tx sa entry indexed by sa_idx, and according to the encryption and ipsec_type bits determine which ipsec off-load to perform.
4. for ah, zero the mutable fields.
5. compute icv and encrypt data (if required for esp) over the appropriate fields according to the operating rules described in section 7.9.2.7, using the aes-128 key and salt fields fetched at step 3.
6. insert icv at its appropriate location and replace the plaintext with the ciphertext (if required for esp).

7.9.2.6.2 single send with tcp/udp checksum offload:
1. extract the ipsec off-load command from the ipsec bit of the popts field in the advanced transmit data descriptor.
2. if ipsec off-load is required for the packet (ipsec bit was set), then extract the sa_idx, encryption, ipsec_type, and ips_esp_len fields from the tx context descriptor associated with that flow.
3. fetch the aes-128 key and salt from the tx sa entry indexed by sa_idx, and according to the encryption and ipsec_type bits determine which ipsec off-load to perform.
4. compute the byte offset from the start of the dma'd data to the first byte to be included in the checksum (the start of the tcp header, as specified in section 7.9.2.4.2).
5. compute the tcp/udp checksum up to the last byte of the dma data or, for esp packets, up to ips_esp_len bytes before it. as in the usual case, implicitly pad out the data by one zeroed byte if its length is an odd number.
6. sum the full checksum of the ip pseudo header, placed by software at its appropriate location, with the tcp/udp checksum computed at step 5. overwrite the checksum location with the one's complement of the sum.
7. for ah, zero the mutable fields.
8. compute icv and encrypt data (if required for esp) over the appropriate fields according to the operating rules described in section 7.9.2.7, using the aes-128 key and salt fields fetched at step 3.
9. insert icv at its appropriate location and replace the plaintext with the ciphertext (if required for esp).

7.9.2.6.3 large send tcp/udp:
1. extract the ipsec off-load command from the ipsec bit of the popts field in the advanced transmit data descriptor.
2. if ipsec off-load is required for the packet (ipsec bit was set), then extract the sa_idx, encryption, and ipsec_type fields from the tx context descriptor associated with that flow.
3. fetch the aes-128 key and salt from the tx sa entry indexed by sa_idx, and according to the encryption and ipsec_type bits determine which ipsec off-load to perform.
4. 
fetch the packet header from system memory, up to iplen+l4len bytes from the start of the dma'd data.
5. overwrite the tcp sequence number with the stored accumulated tcp sequence number (if it is not the first segment).
6. fetch the next mss bytes (or the remaining bytes up to paylen for the last segment) from system memory and form the segment from the packet header and data bytes, while storing the accumulated tcp sequence number.
7. compute the byte offset from the start of the dma'd data to the first byte to be included in the checksum, that is, the start of the tcp/udp header, as specified in section 7.9.2.4.3.
8. compute the tcp/udp checksum up to the last byte of the dma data. as in the usual case, implicitly pad out the data by one zeroed byte if its length is an odd number.
9. for both ipv4 and ipv6, hardware needs to factor the tcp/udp length (typically l4len+mss) into the software-supplied pseudo header partial checksum. it then sums the resulting full checksum of the ip pseudo header with the tcp/udp checksum computed at step 8, and overwrites the tcp/udp checksum location with the one's complement of the sum.
10. increment the ah/esp sn and iv fields by one on every segment (except the first segment), and store the updated sn and iv fields with the other temporary statuses stored for that large send (one large send set of statuses per tx queue).
11. for esp, append the esp trailer: 0-3 padding bytes, padding length, and next header = tcp/udp protocol value, so as to get the 4-byte alignment described in section 7.9.2.4.3.
12. compute the ip total length / ip payload length and compute the ipv4 header checksum as described in the note of section 7.9.2.4.1. place the results in their appropriate locations.
13. for ah, zero the mutable fields.
14. compute icv and encrypt data (if required for esp) over the appropriate fields according to the operating rules described in section 7.9.2.7, using the aes-128 key and salt fields fetched at step 3.
15. insert icv at its appropriate location and replace the plaintext with the ciphertext (if required for esp).
16. go back to step 4 to process the next segment (if necessary).

7.9.2.7 aes-128 operation in tx
the aes-128-gcm crypto engine is used for ipsec. it is the same aes-128-gcm crypto engine as is used for macsec. it is referred to throughout the document as an aes-128 black box, with 4 bit-string inputs and 2 bit-string outputs, as shown in the figure below. refer to the gcm spec for the internal details of the engine. the difference between ipsec and macsec, and between the different ipsec modes, resides in the set of inputs presented to the box.
- key: 128-bit aes-128 key field (secret key) stored for that ipsec flow in the tx sa table: key = aes-128 key
- nonce: 96-bit initialization vector used by the aes-128 engine, which must be distinct for each invocation of the encryption operation for a fixed key. it is formed by the aes-128 salt field stored for that ipsec flow in the tx sa table, followed by the initialization vector (iv) field included in the ipsec packet: nonce = [aes-128 salt, iv]
the nonce, confusingly also referred to as the iv in the gcm spec, is split into two parts: a fixed random part (the "salt") and an increasing counter part (the iv), so the salt value is the fixed part used for every packet of the flow. the purpose behind using the "salt"
value is to prevent offline dictionary-type attacks; as in the hashing case, it prevents predictable patterns in the output.
- aad: additional authentication data input, which is authenticated data that must be left unencrypted.
- plaintext: data to be both authenticated and encrypted.
- ciphertext: encrypted data, whose length is exactly that of the plaintext.
- icv: 128-bit integrity check value (also referred to as the authentication tag).
- h: internally derived from the key.

figure 7-24. aes-128 crypto engine box

note: the square brackets in the formulas are used as a notation for concatenated fields.

7.9.2.7.1 aes-128-gcm for esp - both authenticate and encrypt
aad = [spi, sn]
plaintext = [tcp/udp header, tcp/udp payload, esp trailer]
note: unlike other ipsec modes, in this mode the iv field is used only in the nonce, and it is not included in either the plaintext or the aad inputs. the esp trailer does not include the icv field.

7.9.2.7.2 aes-128-gmac for esp - authenticate only
aad = [spi, sn, iv, tcp/udp header, tcp/udp payload, esp trailer]
plaintext = [] = empty string, no plaintext input in this mode
note: the esp trailer does not include the icv field.

7.9.2.7.3 aes-128-gmac for ah - authenticate only
aad = [ip header, ah header, tcp/udp header, tcp/udp payload]
plaintext = [] = empty string, no plaintext input in this mode
note: both the ip header and the ah header contain mutable fields that must be zeroed before being entered into the engine. among other fields, the ah header always includes the spi, sn, and iv fields.

7.9.2.8 rx descriptors
ipsec off-load is supported only via advanced receive descriptors. see section 7.1.5 for details.

7.9.2.9 rx sa table

7.9.2.9.1 rx sa table structure
the rx sa table contains the extra information required by the aes-128 crypto engine to authenticate and decrypt the data. this information is not sent over the wire with the ipsec packets; rather, it is exchanged between the ipsec peers' operating systems during the security association establishment process.
when the ike software performs a key computation, it computes 4 extra bytes using a pseudo-random function; that is, it generates 20 bytes instead of the 16 bytes it needs for the key, and the last 4 bytes are used as the salt value.

the spi is allocated by the receiving operating system in a unique manner. however, in a virtualized context, guest operating systems can allocate spi values that collide with the spi values allocated by the vmm/pf. consequently, to enable ipsec off-load at least for the vmm/pf flows in such situations, the spi search must be completed by comparing the destination ip address with the ip address of the vmm/pf. handling a separate table for storing the ip addresses of the vmm/pf would cost a die size similar to that of storing an ip address in each sa entry, which is the selected architecture. as a result, guest operating systems can also use the proposed ipsec off-load in rx, provided their sas are configured via the vmm/pf. it is assumed that sas are refreshed once every several minutes, which would not overload the vmm/pf.

the sa table in rx is a 256 x 40-byte table loaded by software, but its lines are reordered by hardware. each line in the table has the layout given in table 7-63.

the ipsec mode field contains the following bits taken from the ipsrxcmd register: ipv6, decrypt, and proto.

to allow a binary search to be performed on it, the table is sorted by hardware using the concatenated [spi, ip address, ipv6 bit] sort/search key:
- after reset or after a delete_all command, the hardware deletes all the entries from the rx table.
- the software can add/delete an entry (one at a time) using a lock mechanism.
- the hardware keeps the table sorted upon any entry addition/deletion.
- when the hardware is in the middle of deleting or adding an entry, it uses a temporary copy of the entry, so it can still process incoming traffic and can perform the look up at a higher priority.
- hardware tracks the number of used sas for debug purposes and for eventually limiting the look up to the used sas only (for example, if only 11 sas are used, they are guaranteed to be at the beginning of the table, so the hardware performs the look up in the first 16 entries only, which takes 4 cycles instead of 8).

in the normal operating mode, the table is handled by the hardware, and software can only add/delete entries without directly accessing the table contents. for debugging purposes, a direct read access to the table is provided.
the debugging mode is enabled by setting the dbg_mod bit in the ipsrxidx register, and this access mode has precedence over the normal access mode.

whenever an unrecoverable memory error occurs when accessing the rx sa table, an nfer interrupt is generated in the icr register, the index of the corrupted sa entry is reported via the peind register, and the receive path is stopped until the host resets the device.

upon reset, the 82576 clears the contents of the rx sa table.

table 7-63. rx sa table
field | size
spi | 4 bytes
ip address | 16 bytes
ipsec mode | 1 byte
aes-128 key | 16 bytes
aes-128 salt | 4 bytes

7.9.2.9.2 normal access to rx sa table
1. software polls the ipsrxcmd.busy bit (read access).
2. if the bit is cleared, software writes the ipsrxkey 0..3, ipsrxsalt, ipsrxspi, and ipsrxipaddr 0..3 registers with the relevant sa fields to be added or deleted.
3. software writes the ipsrxcmd register, setting the sa mode and the right value in the add_del bit. this write triggers the hardware to start the insertion/deletion of the entry into/from the sa table.
4. hardware searches for the appropriate location of the sa entry based on the spi, ip address, and ip version bit (ipv6/ipv4).
5. hardware starts the add/delete process for this entry and sets the busy bit.
   a. add: hardware pushes down all the entries below the found location and inserts the new one; it also increments the used_sa field.
   b. delete: hardware pops up all the entries below the found entry, overwriting it; it also decrements the used_sa field.
   c. during the add/delete process, hardware stores the intermediate sa. when an incoming packet is processed, the add/delete process halts, and a separate look up takes place using the current packet parameters (spi, ip address, and ip version). this look up starts only if there is no match with the intermediate sa.
6. hardware clears the busy bit; software can issue a new access.

7.9.2.9.3 debugging read access to rx sa table
1. software polls the ipsrxcmd.busy bit (read access).
2. if the bit is cleared, software writes the ipsrxidx register with the index of the rx sa entry to be read, while enabling the debugging access mode by setting the dbg_mod bit.
3. software reads the ipsrxkey 0..3, ipsrxsalt, ipsrxspi, ipsrxipaddr 0..3, and ipsrxcmd registers.
4. hardware issues a read command to the sa table, loading the registers from the sa entry indexed by the ipsrxidx content.
5. software disables the debugging access mode (by clearing the dbg_mod bit) or performs another read access from step 1 above.
note that incoming packets can still be processed while in the debugging read access mode.

7.9.2.10 rx hardware flow without tcp/udp checksum offload
1. detect the presence of an ipsec header and determine its type (ah/esp).
2. if an ipsec header is present (as announced by the ip protocol field for ipv4 or by the next header for ipv6), then extract the spi, destination ip address, and ip version (ipv4 or ipv6), and use these fields as the search key in the rx sa table. also report the ipsec protocol found in the security bits of the extended status field in the advanced rx descriptor.
3. if an sa entry matches the search key, then fetch the ipsec rx mode from the sa entry, and according to the proto and decrypt bits determine which ipsec off-load to perform. also, set the secp bit of the extended status field in the advanced rx descriptor. if there was no sa match, then clear the secp bit, report no error in the security error bits of the extended errors field in the advanced rx descriptor, and stop processing the packet for ipsec.
4. if the proto field recorded in the rx sa table does not match the ip protocol field (next header for ipv6) seen in the packet, then report it via the security error bits of the extended errors field in the advanced rx descriptor, and stop processing the packet for ipsec.
5. fetch the aes-128 key and salt from the matched rx sa entry.
6. for ah, zero the mutable fields.
7. check that the ah/esp header is not truncated and, for esp, check whether the packet is 4-byte aligned. if it is not, report it via the security error bits of the extended errors field in the advanced rx descriptor; processing of the packet for ipsec may still be completed (if it has already started). a truncated ipsec packet is a valid ethernet frame (at least 64 bytes) with fewer than the following number of bytes after the ip header:
   a. esp: 40 bytes (16 [esp header] + 4 [min. padding, pad_len, nh] + 16 [icv] + 4 [crc])
   b. ah over ipv4: 40 bytes (20 [ah header] + 16 [icv] + 4 [crc])
   c. ah over ipv6: 44 bytes (20 [ah header] + 4 [icv padding] + 16 [icv] + 4 [crc])
8. compute icv and decrypt data (if required for esp) over the appropriate fields according to the operating rules described in section 7.9.2.12, using the aes-128 key and salt fields fetched at step 5.
9. compare the computed icv with the icv field included in the packet at its appropriate location and report the comparison status (match/fail) via the security error bits of the extended errors field in the advanced rx descriptor.

7.9.2.11 rx hardware flow with tcp/udp checksum offload
perform the rx hardware flow described in section 7.9.2.10 above and add the following steps:
1. start computing the checksum from the beginning of the tcp/udp header, found according to the rx parser logic updated for ipsec formats.
2. for esp, stop computing the checksum before the beginning of the esp trailer, found from the end of the packet according to the padding length field content and to the formats. as in the usual case, implicitly pad out the data by one zeroed byte if its length is an odd number.
3. store the next header extracted from the ah header/esp trailer into the packet type field of the advanced rx descriptor, but use the tcp/udp protocol value in the ip pseudo header used for the tcp/udp checksum. also compute the tcp/udp packet length to be inserted in the ip pseudo header (excluding any ipsec header or trailer).
4. compare the computed checksum value with the tcp/udp checksum included in the packet. report the comparison status in the extended errors field of the advanced rx descriptor.

7.9.2.12 aes-128 operation in rx
the aes-128 operation in rx is similar to the operation in tx, except that for decryption the encrypted payload is fed into the plaintext input, and the resulting ciphertext output is the decrypted payload. refer to section 7.9.2.7 for the proper inputs to use in every ipsec mode.
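As an illustration of the per-mode inputs listed in section 7.9.2.7, the following sketch assembles the nonce, AAD, and plaintext byte strings in software. It is not a description of the hardware data path. The function names `build_nonce` and `gcm_inputs`, the mode constants, and the big-endian packing of 32-bit SPI and SN values are assumptions made for this example. The 8-byte IV length follows from the 96-bit nonce minus the 4-byte salt. For AH, the caller is assumed to pass headers whose mutable fields are already zeroed.

```python
import struct

# Hypothetical mode labels for the three operating modes of section 7.9.2.7.
ESP_GCM = "esp-gcm"    # authenticate and encrypt
ESP_GMAC = "esp-gmac"  # authenticate only
AH_GMAC = "ah-gmac"    # authenticate only


def build_nonce(salt, iv):
    """nonce = [aes-128 salt, iv]: 4-byte salt followed by the 8-byte packet IV."""
    if len(salt) != 4:
        raise ValueError("salt must be 4 bytes")
    if len(iv) != 8:
        raise ValueError("iv must be 8 bytes")
    return salt + iv


def gcm_inputs(mode, spi, sn, iv, l4_header, l4_payload,
               esp_trailer=b"", ip_header=b"", ah_header=b""):
    """Return (aad, plaintext) for the given mode; brackets mean concatenation."""
    l4 = l4_header + l4_payload
    spi_sn = struct.pack(">II", spi, sn)  # assumed network byte order
    if mode == ESP_GCM:
        # The IV is used only in the nonce in this mode.
        return spi_sn, l4 + esp_trailer
    if mode == ESP_GMAC:
        return spi_sn + iv + l4 + esp_trailer, b""
    if mode == AH_GMAC:
        # ip_header and ah_header must already have mutable fields zeroed.
        return ip_header + ah_header + l4, b""
    raise ValueError("unknown mode: %r" % (mode,))
```

In the two GMAC modes the plaintext is empty, so the engine only produces an ICV over the AAD.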
7.9.2.13 handling ipsec packets in rx
the following table summarizes how ipsec packets are handled according to some of their characteristics.

table 7-64. summary of ipsec packets handling in rx
ip fragment | ipv4 option or ipv6 extensions or snap | ip version | sa match | ipsec offload in hw | layer4/3 offload in hw | header split | ah/esp reported in rx desc.
yes | yes | v4 | don't care | no | ip checksum only | up to ipsec header, excluded | yes
yes | yes | v6 | don't care | no | no | up to ip fragment extension, included | no
yes | no | v4 | don't care | no | ip checksum only | up to ipsec header, excluded | yes
no | yes | v4 | don't care | no | ip checksum only | no | yes
no | yes | v6 | don't care | no | no | up to first unknown or ipsec extension header, excluded | no (1)
no | no | v4 | yes | yes | yes (2) | up to tcp/udp/sctp (3) header, included; no split otherwise | yes
no | no | v4 | no | no | ip checksum only | no | yes
no | no | v6 | yes | yes | yes (4) | up to tcp/udp/sctp (5) header, included; no split otherwise | yes
no | no | v6 | no | no | no | no | yes
(1) exception: snap ipsec packets are reported as ah/esp in the rx descriptor.
(2) no layer4 offload is done on packets with ipsec errors.
(3) no split is done for esp packets with sctp as the layer4 protocol.
(4) no layer4 offload is done on esp packets with an icv error.
(5) no split is done for esp packets with sctp as the layer4 protocol.

7.10 virtualization

7.10.1 overview
i/o virtualization is a mechanism to share i/o resources among several consumers. for example, in a virtual system, multiple operating systems are loaded and each executes as though the whole system's resources were at its disposal. however, for the limited number of i/o devices, this presents a problem because each operating system might be in a separate memory domain and all the data movement and device management has to be done by a vmm (virtual machine monitor). vmm access adds latency and delay to i/o accesses and degrades i/o performance. virtualized devices are designed to reduce the burden on the vmm by making certain functions of an i/o device shared, so that they can be accessed directly from each guest operating system or virtual machine (vm).

the 82576 supports two modes of operation in virtualized environments:
1. direct assignment of part of the port resources to different guest oses using the pci sig sr-iov standard, also known as "native mode" or pass-through mode. this mode is referred to as iov mode throughout this chapter.
2. central management of the networking resources by an iovm or by the vmm, also known as software switch acceleration mode. this mode is referred to as next generation vmdq mode.

in a virtualized environment, the 82576 serves up to 8 virtual machines (vms). the virtualization offload capabilities provided by the 82576, apart from the replication of functions defined in the pci-sig iov spec, are also part of next generation vmdq.

a hybrid model, where some of the virtual machines are assigned a dedicated share of the port and the others are serviced by an iovm, should also be supported. however, in this case the offloads provided to the software switch might be more limited. this model can be used when some of the vms run oses for which vf drivers are available and thus can benefit from iov, while others run older oses for which vf drivers are not available and are serviced by an iovm. in this case, the iovm is assigned one vf and receives all the packets with mac addresses of the vms behind it.

the following section describes the support the 82576 provides for these modes. this chapter assumes a single root implementation of iov and no support for multi root.

7.10.1.1 direct assignment model
the direct assignment support in the 82576 is built according to the following model of the software environment. it is assumed that one of the software drivers sharing the port hardware behaves as a master driver (physical function or pf driver). this driver is responsible for the initialization and the handling of the common resources of the port. all the other drivers might read part of the status of the common parts but cannot change it. the pf driver might run either in the vmm or in some service operating system. it might be part of an iovm or part of a dedicated service operating system.

in addition, some of the non time critical tasks are also handled by the pf driver. for example, access to csrs through the i/o space and access to the configuration space are available only through the master interface. time critical csr space, such as control of the tx and rx queues or interrupt handling, is replicated per vf and is directly accessible by the vf driver.

note: in some systems with a thick hypervisor, the service operating system might be an integral part of the vmm. for these systems, each reference to the service operating system in the document below refers to the vmm.

7.10.1.1.1 rationale
the purpose of direct assignment is to enable each of the virtual machines to receive and transmit packets with a minimum of overhead.
the non time critical operations such as init and error handling can be done via the pf driver. in addition, it is important that the vms can operate independently with minimal disturbance. it is also preferable that the vm interface to the hardware be as close as possible to the native interface in non-virtualized systems, in order to minimize the software development effort.

the main time critical operations that require direct handling by the vm are:
1. maintenance of the data buffers and descriptor rings in host memory. to support this, the dma accesses of the queues associated with a vm should be identified as such on pcie using a different requester id.
2. handling of the hardware ring (tail bump and head updates).
3. interrupt handling.

the capabilities needed to provide independence between vms are:
1. per-vm reset and enable capabilities.
2. tx rate control.
3. allocation of a separate csr space per vm. this csr space is organized as closely as possible to the regular csr space to allow sharing of the base driver code.

note: the rate control and vf enable capabilities are controlled by the pf.
7.10.1.2 system overview
the following drawings describe the various elements involved in the i/o process in a virtualized system. figure 7-25 describes the flow in software next generation vmdq mode and figure 7-26 the flow in iov mode.

this document assumes that in iov mode, the driver on the guest operating system is aware that it works in a virtual system (para-virtualized) and that there is a channel between each of the virtual machine drivers and the pf driver allowing message passing, such as configuration requests or interrupt messages. this channel might use the mailbox system implemented in the 82576 or any other means provided by the vmm vendor.

figure 7-25. iovm system

figure 7-26. sr-iov based system
7.10.1.3 vmdq1 versus next generation vmdq
the 82576 supports additional virtualization features not supported in the 82575. the following table compares the 82576 virtualization features (next generation vmdq) with the 82575's virtualization features (vmdq1).

table 7-65. vmdq1 versus next generation vmdq
feature | vmdq1 | next generation vmdq
queues | 4 | 16
pools | 2/4 | 8
mac addresses | 16 | 24
queuing to pool method | sa or vlan | sa or vlan or (sa and vlan)
rss in pool | redirection table per pool | common redirection table, enable per pool
vm to vm switching | no | yes
broadcast and multicast replication | no | yes
mac and vlan anti spoof protection | no | yes
drop if no pool | no | yes
per pool statistics | no | yes
per pool offloads | partial | yes
mirroring | no | yes
long packet filtering | global | global and per pool
storm control | no | yes

7.10.2 pci sig sr-iov support

7.10.2.1 sr-iov concepts
iov defines the following entities in relation to i/o virtualization:
1. virtual image (vi): a virtual machine to which a part of the i/o resources is assigned. also known as a vm.
2. i/o virtual intermediary (iovi) or i/o virtual machine (iovm): a special virtual machine that owns the physical device and is responsible for the configuration of the physical device.
3. end point (ep): the physical device that might contain a few physical functions - in our case, the 82576.
4. physical function (pf): a function representing a physical instance - in our case, one port. the pf driver is responsible for the configuration and management of the shared resources in the function.
5. virtual function (vf): a part of a pf assigned to a vi.

7.10.2.2 config space replication
the sr-iov spec defines a reduced configuration space for the virtual functions. most of the pcie configuration of the vfs is inherited from the pf.
this section describes the expected handling of the different parts of the configuration space for virtual functions. it deals only with the parts relevant to the 82576. details of the configuration space for virtual functions can be found in section 9.7.

7.10.2.2.1 legacy pci config space
the legacy configuration space is allocated to the pf only and emulated for the vfs. a separate set of bars and one bus master enable bit are allocated in the sr-iov capability structure in the pf and are used to define the address space used by the whole set of vfs.
all the legacy error reporting bits are emulated for the vf. see section 7.10.2.4 for details.

7.10.2.2.2 memory bars assignment
the sr-iov spec defines a fixed stride for all the vf bars, so that each vf can be allocated part of the memory bars at a fixed stride from a basic set of bars. in this method, only two decoders per replicated bar per pf are required, and the bars reflected to the vf are emulated by the vmm.
the only bars that are useful for the vfs are bar0 and bar3, thus only those are replicated. the following table describes the existing bars and the stride used for the vfs:

table 7-66. vf bars in the 82576
bar | type | usage | requested size per vf
0 | mem | csr space | max (16k, page size)
1 | mem | high word of csr space address | n/a
2 | n/a | not used | n/a
3 | mem | msi-x | max (16k, page size)
4 | mem | high word of msi-x space address | n/a
5 | n/a | not used | n/a

bar0 of each vf is a sparse version of the original pf bar and includes only the registers relevant to the vf. for more details see section 8.26. figure 7-27 describes the different bars in an iov enabled system.
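The fixed-stride layout can be illustrated with a short calculation. This is a sketch under assumptions, not a statement of how the device or any VMM decodes addresses: the function names and the example base address are hypothetical, and the stride is assumed to equal the requested size per VF from table 7-66, max(16K, page size).

```python
MIN_VF_BAR_SIZE = 16 * 1024  # 16K lower bound from table 7-66


def vf_bar_size(page_size):
    """Requested size per VF for a replicated BAR: max(16K, page size)."""
    return max(MIN_VF_BAR_SIZE, page_size)


def vf_bar_address(first_vf_base, vf_index, page_size):
    """Address of a replicated BAR for a given VF.

    first_vf_base is the address of the BAR for VF 0 (hypothetical value
    chosen by the caller); each following VF sits one stride further.
    """
    if vf_index < 0:
        raise ValueError("vf_index must be non-negative")
    return first_vf_base + vf_index * vf_bar_size(page_size)
```

For example, with a 4 KB system page size the stride stays at 16 KB, so `vf_bar_address(0xF0000000, 3, 4096)` evaluates to `0xF000C000`; with a 64 KB page size the stride grows to 64 KB.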
figure 7-27. memory bar allocation to vfs

7.10.2.2.3 pcie capability structure

7.10.2.2.4 pci-express capability structure
the pcie capability structure is shared between the pf and the vfs. the only relevant bits that are replicated are:
1. transaction pending
2. function level reset. see section 7.10.2.3 for details.

7.10.2.2.5 msi and msi-x capabilities
both msi and msi-x are implemented in the pf in the 82576. msi-x vectors can be assigned per vf. msi is not supported for the vfs. see section 9.7 for more details of the msi-x and pba tables implementation.

7.10.2.2.6 vpd capability
vpd is implemented only once and is accessible only from the pf.

7.10.2.2.7 power management capability
the pci sig sr-iov specification makes vf power management optional. the 82576 does not support power management in vfs. the power management registers are implemented in the vfs but support only the d0 state. power management is emulated for the vfs by the iovm and is implemented only for the pf.

7.10.2.2.8 serial id
the same serial id is reported to all vfs in the device.

7.10.2.2.9 error reporting capabilities (advanced & legacy)
all the bits in this capability structure are implemented only for the pf. the vms see an emulated version of this. see section 7.10.2.4 for details.

7.10.2.3 function level reset (flr) capability
the flr bit is required per vf. setting this bit resets only the part of the logic dedicated to the specific vf and does not influence the shared part of the port. this reset should disable the queues, disable interrupts and stop the receive and transmit processes per vf. setting the pf flr bit resets the entire function.

7.10.2.4 error reporting
error reporting includes baseline error reporting and the aer (advanced error reporting or role based) capability.
the baseline error management includes the following functions:
1. error capabilities enablement. these are set by the pf for all the vfs. narrower error reporting for a given vm can be achieved by filtering of the errors by the vmm.
2. error status in the config space. these should be set separately for each vf.
   a. see section 9.7.1 for details about vf specific error reporting registers.
the aer capability includes the following functions:
1. error capabilities enablement. the error mask and severity bits are set by the pf for all the vfs. narrower error reporting for a given vm can be achieved by filtering of the errors by the vmm.
2. non-function specific errors status in the config space.
   a. non-function specific errors are logged in the pf.
   b. each error is logged in one register only.
   c. the vmm avoids touching all vfs to clear device level errors.
3. function specific errors status in the config space.
   a. function specific errors are logged in the vf.
   b. allows per vf error detection and logging.
   c. helps with fault isolation.
   d. see section 9.7.2.3 for details about vf specific aer registers.
4. error logging. each vf has its own header log.
5. error messages. in order to ease the detection of the source of the error, the error messages should be emitted using the requester id of the vf in which the error occurred.

7.10.2.5 ari & iov capability structures
in order to allow more than 8 functions per end point without requiring an internal switch, as is usually needed in virtualization scenarios, the pci sig defines the alternative routing id (ari) capability structure. this is a new capability that allows an interpretation of the device and function fields as a single identification of a function within the bus. in addition, a new structure used to support the iov capabilities reporting and control is defined. both structures are described in sections 9.6.3 & 9.6.4. see the next section for details on the rid allocation to vfs.

7.10.2.6 requester id allocation
the requester id allocation of the vf is done using the offset field in the iov structure. this field should be replicated per vf and is used to do the enumeration of the vfs. each pf includes an offset to the first associated vf. this pointer is a relative offset to the bdf of the first vf. the offset field is added to the pf's requester id to determine the requester id of the next vf. an additional field in the iov capability structure describes the distance between the requester ids of two consecutive vfs.
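the offset and stride arithmetic can be sketched in python as follows. this is only an illustration: the function names are hypothetical, the first-vf offsets (128 in ari mode, 384 in non ari mode) are the values noted in tables 7-67 and 7-68, and the stride of 2 is inferred from the rows of those tables rather than stated in the text.

```python
def make_rid(bus: int, device: int, function: int) -> int:
    """Pack bus (8 bits), device (5 bits) and function (3 bits) into a 16-bit RID."""
    return (bus << 8) | (device << 3) | function


def split_rid(rid: int) -> tuple:
    """Unpack a 16-bit RID into (bus, device, function)."""
    return (rid >> 8, (rid >> 3) & 0x1F, rid & 0x7)


def vf_rid(pf_rid: int, first_vf_offset: int, vf_stride: int, vf_index: int) -> int:
    """RID of VF n = PF RID + offset to the first VF + n * stride."""
    return pf_rid + first_vf_offset + vf_index * vf_stride


if __name__ == "__main__":
    bus = 3  # hypothetical bus number of the device
    pf_port0 = make_rid(bus, 0, 0)
    pf_port1 = make_rid(bus, 0, 1)
    # ari mode: vf 7 of each port
    print(split_rid(vf_rid(pf_port0, 128, 2, 7)))
    print(split_rid(vf_rid(pf_port1, 128, 2, 7)))
    # non ari mode: vf 0 of port 0 lands on the next bus
    print(split_rid(vf_rid(pf_port0, 384, 2, 0)))
```

because the two ports interleave their vfs, the least significant bit of the resulting function number always equals the port number.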
7.10.2.6.1 bus-device-function layout
7.10.2.6.1.1 ari mode
the ari proposal allows interpretation of the device id part of the rid as part of the function id inside a device. thus a single device can span up to 256 functions. in order to ease the decoding, the least significant bit of the function number points to the physical port number. the next bits indicate the vf number. the following table describes the vf requester ids.

table 7-67. rid per vf - ari mode
port   vf#   b,d,f    binary        notes
0      pf    b,0,0    b,00000,000   pf
1      pf    b,0,1    b,00000,001   pf
0      0     b,16,0   b,10000,000   offset to first vf from pf is 128.
1      0     b,16,1   b,10000,001
0      1     b,16,2   b,10000,010
1      1     b,16,3   b,10000,011
0      2     b,16,4   b,10000,100
1      2     b,16,5   b,10000,101
...
0      7     b,17,6   b,10001,110
1      7     b,17,7   b,10001,111   last

7.10.2.6.1.2 non ari mode
when ari is disabled, non zero devices in the first bus cannot be used, thus a second bus is needed to provide enough requester ids. in this mode, the rid layout is as follows:

table 7-68. rid per vf - non ari mode
port   vf#   b,d,f      binary          notes
0      pf    b,0,0      b,00000,000     pf
1      pf    b,0,1      b,00000,001     pf
0      0     b+1,16,0   b+1,10000,000   offset to first vf from pf is 384.
1      0     b+1,16,1   b+1,10000,001
0      1     b+1,16,2   b+1,10000,010
1      1     b+1,16,3   b+1,10000,011
0      2     b+1,16,4   b+1,10000,100
1      2     b+1,16,5   b+1,10000,101
...
0      7     b+1,17,6   b+1,10001,110
1      7     b+1,17,7   b+1,10001,111   last

note: when the device id of a physical function changes (because of lan disable or lan function sel settings), the vf device ids change accordingly.

7.10.2.7 hardware resources assignment
the main resources to allocate per vm are queues and interrupts. the assignment is a static one. if a vm requires more resources, it might be allocated more than one vf. in this case, each vf gets a specific mac address/vlan tag in order to allow forwarding of incoming traffic. the two vfs are then teamed in software.

7.10.2.7.1 physical function resources
a possible use of the physical function is for configuration setting without transmit and receive capabilities. in this case it is not allocated any queues and is allocated one msi-x vector.
the physical function has access to all the resources of all the virtual machines, but it is not expected to make use of resources allocated to active virtual functions.

7.10.2.7.2 resource summary
the 82576 supports 8 vms, 2 queue pairs (tx/rx) per vm and 3 msi-x vectors per vm.

7.10.2.8 csr organization
the csrs of the nic can be divided into three types:
1. global configuration registers that should be accessible only to the pf (such as link control, led control, etc.). this type of register also includes all the debug features, such as the mapping of the packet buffers, and is responsible for most of the csr area requested by the nic. this includes per vf configuration parameters that can be set by the pf without performance impact.
2. per queue parameters that should be replicated per queue (head, tail, rx buffer size, and dca tag). these parameters are used both by a vf in an iov system and by the pf in a non iov mode.
3. per vf parameters (per vf reset, interrupt enable). multiple instances of these parameters are used only in an iov system and only one instance is needed for non iov systems.
in order to support iov without disturbing the current driver's operation in legacy mode, the following method is used:
1. the pf instance of bar0 continues to contain the legacy and control registers. it is accessible only to the pf. the bar allows access to all the resources including the vf queues and other vf parameters. however, it is expected that the pf driver does not access these queues in iov mode.
2. the vf instances of bar0 provide the control of the vf specific registers. these bars have the same mapping as the original bar0 with the following exceptions:
   a. fields related to the shared resources are reserved.
   b. the 2 queues of the vf are mapped at the same location as the first 2 queues of the pf.
3. assuming some backward compatibility is needed for iov drivers, the pf/vf parameters block should contain a partial register set as described in section 8.26.

7.10.2.9 iov control
in order to control the iov operation, the physical driver is provided with a set of registers. these include:
1. the mailbox mechanism described below.
2. the switch and filtering control registers described in section 7.10.3.12.
3. vflre: register indicating that a vflr reset occurred in one of the vfs (bitmap).
4. vfte: enables tx traffic per vf. a vf tx is disabled by an flr to this vf until the pf enables it again. this allows the pf to block the transmit process until the configurations for this vm are done.
5. vfre: enables rx filtering per vf. a vf rx is disabled by an flr to this vf until the pf enables it again. this allows the pf to block the receive process until the configurations for this vm are done.

7.10.2.9.1 vf to pf mailbox
the vf drivers and the pf driver require some means of communication between them. this channel can be used for the pf driver to send status updates to the vfs (link change, memory parity error, etc.) or for the vf to send requests to the pf (add to vlan). such a channel can be implemented in software, but it requires enablement by the vmm vendors. in order to avoid the need for such an enablement, the 82576 provides such a channel that allows direct communication between the two drivers.
the channel consists of a mailbox similar to the host interface currently defined between the software and the manageability firmware. each driver can then receive an indication (either poll or interrupt) when the other side wrote a message.
assuming a max message size of 64 bytes (one cache line), ram of 64 bytes x 8 vms = 0.5 kbyte is provided. table 7-69 shows how the ram is organized.

table 7-69. mailbox memory
ram address   function     pf bar 0 mapping (1)   vf bar 0 mapping (2)
0 - 63        vf0 <-> pf   0 - 63                 vf0 + mbo
64 - 127      vf1 <-> pf   64 - 127               vf1 + mbo
...
448 - 511     vf7 <-> pf   448 - 511              vf7 + mbo
1. relative to mailbox offset.
2. mbo = mailbox offset in vf csr space.

in addition, for each vf, the vfmailbox and pfmailbox registers are defined in order to coordinate the transmission of the messages. these registers contain a semaphore mechanism to allow coordination of the mailbox usage. the pf driver can decide which vfs are allowed to interrupt the pf to indicate a mailbox message using the mbvfimr mask register.
the following flows describe the usage of the mailbox:

table 7-70. pf to vf messaging flow
step   performed by    action
1      pf driver       set pfmailbox[n].pfu
2      hardware        set pfu bit if pfmailbox[n].vfu is cleared
3      pf driver       read pfmailbox[n] and check that pfu bit was set. otherwise wait and go to step 1
4      pf driver       write message to relevant location in vmbmem
5      pf driver       set the pfmailbox[n].sts bit and wait for ack (1)
6      hardware        indicate an interrupt to vf #n
7      vf #n driver    read the message from vmbmem
8      vf #n driver    set the vfmailbox.ack bit
9      hardware        indicate an interrupt to pf
10     pf driver       clear pfmailbox[n].pfu
1. the pf might implement a timeout mechanism to detect non responsive vfs.

table 7-71. vf to pf messaging flow
step   performed by    action
1      vf #n driver    set vfmailbox.vfu
2      hardware        set vfu bit if vfmailbox[n].pfu is cleared
3      vf #n driver    read vfmailbox[n] and check that vfu bit was set. otherwise wait and go to step 1
4      vf #n driver    write message to relevant location in vmbmem
5      vf #n driver    set the vfmailbox.req bit
6      hardware        indicate an interrupt to pf via icr.vmmb
7      pf driver       read mbvficr to detect which vf caused the interrupt
8      pf driver       read the adequate message from vmbmem
9      pf driver       set the pfmailbox.ack bit
10     hardware        indicate an interrupt to vf #n
11     vf #n driver    clear vfmailbox.vfu

the content of the message is hardware independent and can be fixed by the software. the messages currently assumed by this specification are:
1. registration to vlan/multicast packet/broadcast packets - a vf can request to be part of a given vlan or to get some multicast/broadcast traffic.
2. reception of large packet - each vf should notify the pf driver what is the largest packet size allowed in receive.
3. get global statistics - a vf can request information from the pf driver on the global statistics.
4. filter allocation request - a vf can request allocation of a filter for queuing/immediate interrupt support.
5. global interrupt indication.
6. indication of errors.
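the pf to vf flow of table 7-70 can be modelled in software with a small python sketch. this is a behavioural model under stated assumptions, not a driver: the class and function names are hypothetical, interrupts are not modelled, and clearing the sts and ack flags when they are consumed is an assumption of the model rather than documented register behaviour.

```python
from dataclasses import dataclass, field

MSG_BYTES = 64  # max message size, one cache line


@dataclass
class Mailbox:
    """Model of one VF's mailbox: semaphore bits plus its 64-byte slot of VMBMEM."""
    pfu: bool = False  # buffer owned by the PF
    vfu: bool = False  # buffer owned by the VF
    sts: bool = False  # PF has posted a message
    ack: bool = False  # VF has acknowledged the message
    mem: bytearray = field(default_factory=lambda: bytearray(MSG_BYTES))


def slot_range(vf_index: int) -> tuple:
    """First and last RAM address of VF n's slot, relative to the mailbox offset."""
    return (vf_index * MSG_BYTES, vf_index * MSG_BYTES + MSG_BYTES - 1)


def pf_send(mb: Mailbox, msg: bytes) -> bool:
    """Steps 1-5: take the semaphore, write the message, set STS."""
    if len(msg) > MSG_BYTES:
        raise ValueError("message longer than 64 bytes")
    if mb.vfu:           # step 2: hardware grants PFU only if VFU is cleared
        return False     # step 3: caller should wait and retry
    mb.pfu = True
    mb.mem[:len(msg)] = msg  # step 4
    mb.sts = True            # step 5 (hardware would now interrupt the VF, step 6)
    return True


def vf_receive(mb: Mailbox):
    """Steps 7-8: read the message and acknowledge it."""
    if not mb.sts:
        return None
    msg = bytes(mb.mem)
    mb.sts = False  # model assumption: status is consumed on read
    mb.ack = True   # step 8 (hardware would now interrupt the PF, step 9)
    return msg


def pf_complete(mb: Mailbox) -> bool:
    """Step 10: release the semaphore once the VF has acknowledged."""
    if not mb.ack:
        return False
    mb.ack = False
    mb.pfu = False
    return True
```

a pf driver following footnote 1 of table 7-70 would wrap the wait for the ack in a timeout, so that a non responsive vf cannot hold the pf indefinitely.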
7.10.2.10 interrupt handling
interrupts can be separated into two types:
1. interrupts relevant to the behavior of each vm, including rx & tx packet sent indications, mailbox message and device status indications.
2. interrupts relevant only to the handling of the shared resources. these are mainly error indications, such as packet buffer full and parity errors.
the first type of interrupts should be provided directly to the vm driver and the second type can be handled by the pf driver.
interrupt control in the vf uses the same mechanism as in the non virtualized case. the cause bits are independent and each vf can clear its own cause bits independently. the following registers are added per vf:
1. vteicr, vteics, vteims, vteimc, vteiac, vteiam with the following fields:
   a. rtxq[1:0]
   b. mailbox
2. vteitr0,1,2.
3. vtivar & vtivar_misc for mailbox.

7.10.2.10.1 low latency interrupts
low latency interrupts (lli) are described in section 7.3.5. several packet types generate lli:
• a packet matching a 5-tuple filter assigned to a vf - each vf can request from the pf driver an lli for one of its flows.
• a packet matching an l2 ethertype filter - an lli is generated to specific vfs (based on the queue assignment) that handle control traffic.
• a packet matching a certain vlan priority - an lli is generated to the target vf based on the queue assignment for the rx packet.
an "and" condition on the vm number is added to the immediate interrupt decision in order to prevent a vm from requesting immediate interrupts for flows not owned by it and in order to allow a filter to apply only to a given vm. for example, assume a given vm requires immediate interrupts on packets with the psh flag set. the vm number filtering prevents other vms from receiving immediate interrupts on such packets.

7.10.2.10.2 msi-x
msi-x tables are in bar3.
the msi-x vectors might be used either as one big set of vectors in non iov mode or as small sets allocated to vfs. in order to support both modes and avoid duplicating the logic, the first iov vectors should also be mapped as non iov vectors. the mapping of the vectors in iov mode is described in section 8.26.3.
the pba vector is replicated for the iov case, as the saving in area is low and a different per bit encoding is complicated.

7.10.2.10.3 msi
msi implementation is optional in the iov spec. the 82576 doesn't support msi in virtual functions.
7.10.2.10.4 legacy interrupt (int-x)
legacy interrupts are not supported in iov mode.

7.10.2.11 dma
7.10.2.11.1 requester id
each vf is allocated a requester id. each dma request should use the rid of the vm that requested it. see section 7.10.2.6 for details.

7.10.2.11.2 sharing dma resources
the outstanding requests and completion credits are shared between all the vfs. the tags attached to read requests are assigned the same way they are today, although in vf systems tags can be re-used for different requester ids.

7.10.2.11.3 dca
the dca enable is common to all the devices (all pfs & vfs). given a dca enabled device, each vm might decide for each queue on which type of traffic (data, headers, tx descriptors, rx descriptors) dca should be asserted and what cpu id is assigned to this queue.
note: there are no plans to virtualize dca in the ioh. thus the physical cpu id should be used in the programming of the cpuid field.

7.10.2.12 timers and watchdog
7.10.2.12.1 tcp timer
the tcp timer is available only to the pf. it might indicate an interrupt to the vfs via the mailbox mechanism.

7.10.2.12.2 ieee 1588
ieee 1588 is a per link function and thus is controlled by the pf driver. the vms have access to the real time clock register.

7.10.2.12.3 watchdog
the watchdog was originally developed for pass-through nics where virtualization is not an interesting use case. thus, this functionality is used only by the pf.

7.10.2.12.4 free running timer
the free running timer is a pf driver resource the vms can access. this register is read only to all vfs. it is reset only by the pci reset.
7.10.2.13 power management and wakeup
power management is a pf resource and is not supported per vf.

7.10.2.14 link control
the link is a shared resource and as such is controllable only by the pf. this includes phy settings, speed and duplex settings, flow control settings, etc. the flow control packets are sent with the station mac address stored in the eeprom. the watermarks of the flow control process and the time-out value are also controllable by the pf only.
macsec is a per link function and thus is controlled by the pf driver.
double vlan is a network setting and as such should be common to all vfs.

7.10.2.14.1 special filtering options
pass bad packets is a debug feature. as such, pass bad packets is available only to the pf. bad packets are passed according to the same filtering rules as regular packets. as it might cause guest operating systems to get unexpected packets, it should be used only for debug purposes of the whole system.
reception of long packets is controlled separately per vm. as this impacts the flow control thresholds, the pf should be made aware of the decision of all the vms. because of this, the setup of the large send packets is centralized by the pf and each vf might request this setting.

7.10.2.14.2 allocation of memory space for iov functions
if the bios didn't allocate memory for the iov functions, the following flow may be used to allocate memory to the 82576 virtual functions:
1. in the eeprom, request some space for the serial flash bar. this space should be large enough to cover the iov vf memory space needs. for example, assuming the memory page size is 4k and 8 vfs are enabled, then 256 kbytes of ram should be requested (16 k for the csr bar plus 16 k for the msi-x bar, times 8 functions).
2. before enabling iov, zero the flash bar and program the iov bars to use the old flash bar. the vfs csr bar may use the first half of the original flash memory and the msi-x bar may use the second half.

7.10.3 packet switching
7.10.3.1 assumptions
the following assumptions are made:
1. the required bandwidth for the vm to vm loopback traffic is low. for example, the pcie bw is not congested by the combination of the vm to vm and the regular incoming traffic. this case is handled but not optimized for. unless specified otherwise, tx and rx packets should not be dropped or lost due to congestion caused by loopback traffic.
2. most of the offloads provided on rx traffic are not provided for the vm to vm loopback traffic.
3. if the buffer allocated for the vm to vm loopback traffic is full, it is ok to back pressure the transmit traffic. this means that the outgoing traffic might be blocked if the loopback traffic is congested.
4. the decision on vm to vm loopback traffic is done only according to the ethernet da address and the vlan tag, support of cts tbd. there is no filtering according to other parameters (ip, l4, etc.). this switch has no learning capabilities.
5. the forwarding decisions are based on the receive filtering programming.
6. when the link is down or during flow control events, the tx flow is stopped, and thus the local switching traffic is stopped also.

7.10.3.2 vf selection
the vf selection is done by mac address and vlan tag. broadcast & multicast packets are forwarded according to the individual setting of each vf and might be replicated to multiple vfs.

7.10.3.2.1 filtering capabilities
the following capabilities exist to decide the final destination of each packet, in addition to the regular l2 filtering capabilities:
• 24 mac address filters (rah/ral registers) for both unicast and multicast filtering. these are shared with l2 filtering. for example, the same mac addresses are used to determine if a packet is received by the switch and to determine the forwarding destination.
• 32 shared vlan filters (vlvf registers) - each vm can be made a member of each vlan.
• multicast exact filtering using the existing remaining rah/ral registers, otherwise an imperfect multicast table shared between vfs.
• 256 hash filtering of multicast addresses shared between the vfs (mta table).
• promiscuous multicast & enable broadcast per vf.
note: packets for which no queueing decision was made and which are still accepted by the l2 filtering are directed to the queue pool of the default vf or dropped.

7.10.3.3 l2 filtering
l2 filtering is the first of the three stages that determine the destination of a received packet. the three stages are defined in section 7.1.1.
all received packets pass the same filtering as in the non virtualized case: regular vlan filtering using the global vlan table (vta) of the pf, and filtering according to the rah/ral registers and according to the various promiscuous bits.
note: every vlan tag set in the vlvf registers should also be asserted in the vta table.
the rctl.upe bit (promiscuous unicast) is not available per vf and might be modified only by the pf driver.

7.10.3.4 size filtering
a packet is defined as undersize if it is smaller than 64 bytes.
a packet is defined as oversize in the following conditions:
• the rctl.lpe bit is cleared and one of the following conditions is met:
  - the packet is bigger than 1518 bytes and there are no vlan tags in the packet.
  - the packet is bigger than 1522 bytes and there is one vlan tag in the packet.
  - the packet is bigger than 1526 bytes and there are two vlan tags in the packet.
• the rctl.lpe bit is set and the packet is bigger than rlpml.rlpml bytes.

7.10.3.5 rx packets switching
rx packet switching is the second of the three stages that determine the destination of a received packet. the three stages are defined in section 7.1.1. as far as switching is concerned, it doesn't matter whether the virtual environment operates in iov mode or in next generation vmdq mode. the vf is identified by the "pool list" as described in section 7.10.3.5.1 and section 7.10.3.5.2.
when working in a virtualized environment, a single rx queue can still be determined by the ethertype filters. if these filters don't match, then a pool list should be found. then the switch should determine which of the 2 queues of the targeted vms is addressed. this third stage, which determines the queue in the pool, is described in section 7.1.1.2.
when working in replication mode, broadcast and multicast packets can be forwarded to more than one vm, and are replicated to more than one rx queue. replication is enabled by the rpl_en bit in the vt_ctl register.
in virtualization modes, the pool list is a list of one or more vms to which the packet should be forwarded. the pool list is used in choosing the target queue list, except for cases in which high priority filters take precedence. there is a difference in the way the pool list is found when replication mode is enabled or disabled.

7.10.3.5.1 replication mode enabled
when replication mode is enabled, each broadcast/multicast packet can go to more than one pool. finding the pool list should be done according to the following steps:
1. exact unicast or multicast match - if there is a match in one of the exact filters (ral/rah), for unicast or multicast packets, take the rah.poolsel[7:0] field as a candidate for the pool list.
2. broadcast - if the packet is a broadcast packet, add pools for which their vmolr.bam bit (broadcast accept mode) is set.
3. unicast hash - if the packet is a unicast packet, and the prior steps yielded no pools, check it against the unicast hash table (uta). if there is a match, add pools for which their vmolr.rope bit (receive overflow packet enable) is set.
4. multicast hash - if the packet is a multicast packet and the prior steps yielded no pools, check it against the multicast hash table (mta). if there is a match, add pools for which their vmolr.rompe bit (receive multicast packet enable) is set.
5. multicast promiscuous - if the packet is a multicast packet, take the candidate list from prior steps and add pools for which their vmolr.mpe bit (multicast promiscuous enable) is set.
6. ignore mac (vlan only filtering) - if the vt_ctl.igmac bit is set, then the previous steps are ignored and a full pool list is assumed for the next step.
7. vlan groups - this step is relevant only if the rctl.vfe bit is set, otherwise it is skipped. packets should be sent only to vms that belong to the packet's vlan group.
   a. tagged packets: enable only pools in the packet's vlan group as defined by the vlan filters - vlvf[n].vlan_id and their pool list - vlvf[n].poolsel[7:0].
   b. untagged packets: enable only pools with their vmolr.aupe bit set.
   c. if there is no match, the pool list should be empty.
   note: in a vlan network, untagged packets are not expected. such packets received by the switch should be dropped, unless their destination is a virtual port set to receive these packets. the setting is done through the vmolr.aupe bit. it is assumed that vms for which this bit is set are members of a default vlan and thus only mac queuing is done on these packets.
8. default pool - if the pool list is empty at this stage and the vt_ctl.dis_def_pool bit is not set, then set the default pool bit in the target pool list (from vt_ctl.def_pl).
9. ethertype filters - if one of the ethertype filters (etqf) is matched by the packet and a queuing action is requested, the vm list is set to the pool pointed to by the filter.
10. vfre - if any bit in the vfre register is cleared, clear the respective bit in the pool list.
   note: the vfre filtering is applied only after the decision to forward the packet to the network and/or a local pool (based on mac address and vlan). if a packet that matches an exact mac address is set to be forwarded to a local pool, it is not sent to the network regardless of the vfre setting. therefore, when a pool is disabled, the software should also clear its exact mac address filters before clearing the vfre.
11. length limit - if the packet is longer than a legal ethernet packet, remove from the pool list all the pools for which the vmolr.lpe bit is not set or for which the packet length is larger than the value in the vmolr.rlpml field.
12. mirroring - for each of the 4 mirroring rules, add the destination (mirroring) pool (vmrctl.mp) to the pool list according to the following rules:
   a. pool mirroring - if vmrctl.vpme is set and one of the bits in the pool list matches one of the bits in the vmrvm register.
   b. vlan port mirroring - if vmrctl.vlme is set and the index of the vlan of the packet in the vlvf table matches one of the bits in the vmrvlan register.
   c. uplink port mirroring - if vmrctl.upme is set and the pool list is not empty and the packet came from the lan.
   d. downlink port mirroring - if vmrctl.dpme is set and the packet came from the host and is transmitted to the network (relevant only for vm to vm traffic). this means that, when this bit is set, transmit traffic is mirrored to the mirrored port. there is no mirroring to the network.
   e. vfre - if any bit in the vfre register is cleared, clear the respective bit in the pool list.
13. length limit - if the packet is longer than a legal ethernet packet, remove from the pool list all the pools for which the vmolr.lpe bit is not set or for which the packet length is larger than the value in the vmolr.rlpml field.
the above process, up to stage 9, can be logically described by the following scheme:
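steps 1 to 11 of the replication enabled flow can also be sketched in python, with the pool list held as an 8-bit mask. this is a simplified model under several assumptions: the class, field and function names are hypothetical; the packet classification and filter lookups (exact match, uta/mta hit, vlvf lookup, etqf match) are assumed to have been done already and are passed in as inputs; the "legal ethernet packet" limit is taken as a configurable value defaulting to 1518 bytes; and mirroring (step 12) is left out.

```python
from dataclasses import dataclass
from typing import List, Optional

NUM_POOLS = 8
ALL_POOLS = (1 << NUM_POOLS) - 1


@dataclass
class Vmolr:
    """Per-pool VMOLR bits used by the pool list selection."""
    bam: bool = False
    rope: bool = False
    rompe: bool = False
    mpe: bool = False
    aupe: bool = False
    lpe: bool = False
    rlpml: int = 1518


@dataclass
class Packet:
    kind: str                         # "unicast", "multicast" or "broadcast"
    length: int = 64
    exact_pools: int = 0              # RAH.POOLSEL of the matching exact filter, 0 if none
    uta_hit: bool = False
    mta_hit: bool = False
    vlan_pools: Optional[int] = None  # None if untagged, else VLVF.POOLSEL (0 if no match)
    etqf_pool: Optional[int] = None   # pool chosen by a matching ethertype filter


@dataclass
class Config:
    vmolr: List[Vmolr]
    igmac: bool = False
    vfe: bool = False
    dis_def_pool: bool = False
    def_pl: int = 0
    vfre: int = ALL_POOLS
    legal_max: int = 1518             # assumed limit for a "legal" packet


def pools_with(cfg: Config, bit: str) -> int:
    """Bit mask of the pools whose VMOLR has the named bit set."""
    mask = 0
    for i, v in enumerate(cfg.vmolr):
        if getattr(v, bit):
            mask |= 1 << i
    return mask


def rx_pool_list(pkt: Packet, cfg: Config) -> int:
    pools = pkt.exact_pools if pkt.kind in ("unicast", "multicast") else 0  # step 1
    if pkt.kind == "broadcast":                                  # step 2
        pools |= pools_with(cfg, "bam")
    if pkt.kind == "unicast" and pools == 0 and pkt.uta_hit:     # step 3
        pools |= pools_with(cfg, "rope")
    if pkt.kind == "multicast" and pools == 0 and pkt.mta_hit:   # step 4
        pools |= pools_with(cfg, "rompe")
    if pkt.kind == "multicast":                                  # step 5
        pools |= pools_with(cfg, "mpe")
    if cfg.igmac:                                                # step 6
        pools = ALL_POOLS
    if cfg.vfe:                                                  # step 7
        if pkt.vlan_pools is None:
            pools &= pools_with(cfg, "aupe")
        else:
            pools &= pkt.vlan_pools
    if pools == 0 and not cfg.dis_def_pool:                      # step 8
        pools = 1 << cfg.def_pl
    if pkt.etqf_pool is not None:                                # step 9
        pools = 1 << pkt.etqf_pool
    pools &= cfg.vfre                                            # step 10
    if pkt.length > cfg.legal_max:                               # step 11
        for i, v in enumerate(cfg.vmolr):
            if not v.lpe or pkt.length > v.rlpml:
                pools &= ~(1 << i)
    return pools
```

keeping the pool list as a bit mask mirrors the poolsel[7:0] and vfre bitmaps, so each step reduces to a single or/and operation.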
figure 7-28. pool list selection - replication enabled

7.10.3.5.2 replication mode disabled
when replication mode is disabled, the software should take care of multicast and broadcast packets and check which of the vms should get them. in this mode the pool list always contains one pool only, according to the following steps:
1. exact unicast or multicast match - if the packet da matches one of the exact filters (ral/rah), take the rah.poolsel[7:0] field as a candidate for the pool list.
2. ignore mac (vlan only filtering) - if the vt_ctl.igmac bit is set, then the previous steps are ignored and a full pool list is assumed for the next step.
3. unicast hash - if the packet is a unicast packet, and the prior steps yielded no pools, check it against the unicast hash table (uta). if there is a match, add the pool for which the vmolr.rope bit (receive overflow packet enable) is set (see software limitation no. 3 below).
4. vlan groups - this step is relevant only if the rctl.vfe bit is set, otherwise it is skipped. packets should be sent only to vms that belong to the packet's vlan group.
   a. tagged packets: enable only pools in the packet's vlan group as defined by the vlan filters - vlvf[n].vlan_id and their pool list - vlvf[n].poolsel[7:0].
   b. untagged packets: enable only pools with their vmolr.aupe bit set.
   c. if there is no match, the pool list should be empty.
5. default pool - if the packet is a unicast packet and no pool was chosen and the vt_ctl.dis_def_pool bit is not set, then set the default pool bit in the pool list (from vt_ctl.def_pl).
6. broadcast or multicast - if the packet is a multicast or broadcast packet and was not forwarded in step 1, set the default pool bit in the pool list (from vt_ctl.def_pl).
7. length limit - if the packet is longer than a legal ethernet packet, remove from the pool list all the pools for which the vmolr.lpe bit is not set or for which the packet length is larger than the value in the vmolr.rlpml field.
8. vfre - if any bit in the vfre register is cleared, clear the respective bit in the pool list.
   note: the vfre filtering is applied only after the decision to forward the packet to the network and/or a local pool (based on mac address and vlan). if a packet that matches an exact mac address is set to be forwarded to a local pool, it is not sent to the network regardless of the vfre setting. therefore, when a pool is disabled, the software should also clear its exact mac address filters before clearing the vfre.
the above process can be logically described by the following scheme:
intel ? 82576eb gbe controller ? inline functions intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 412 the following limitations applies when replication is disabled: 1. it is the software responsibility must not set more than one bit in the bitmaps of the exact filters. note that multiple bits might be set in an rah register as long as it is guaranteed that the packet is sent to only one queue by other means (vlan or cts) 2. the software must not set per-vm promiscuous bits (multicast or broadcast). 3. the software must not set the rope bit in more than one vmolr register. 4. if vt_ctl.igmac bit is set, the software should must not set the vmolr.aupe in more than one vmolr register and must not set more than one bit in each of the vlvf.poolsel bitmaps. 5. the software should not activate mirroring. 6. the software should take care not to set the rope bit in more than one vmolr register. 7.10.3.6 tx packets switching tx switching is used only in a virtualized environment to serve vm to vm traffic. packets that are destined to one or more local vms, are loop backed to the rx path through a separate packet buffer. enabling tx switching is done by setting the dtxswc.loopback_en bit. tx switching rules are very similar to rx switching in a virtualized environment, with the following exceptions and rules: ? there high priority filters (etype/syn/5-tuple) are not applied to the tx traffic. figure 7-29. pool list selection ? replication disabled
- if a target pool is not found, the default pool is not used, and the packet might only go to the external lan.
- rss is not used for queue selection inside a vm.
- a unicast packet that matches an exact filter is not sent to the lan.
- broadcast and multicast packets are always sent to the external lan too, unless the packet belongs to a local vlan.
- if an outgoing packet's vlan matches a vlvf entry with the lvlan bit set, this packet is not sent to the external lan. this rule overrides previous rules.
- a packet might not be sent back to the originating vm (even if the destination address is equal to the source address).
however, in order to off-load a software switch allowing multiple vms sharing the same pool, or for vf loopback diagnostics, the 82576 provides the capability to loop back packets inside a pool. in the normal case, a packet whose source and destination are the same is dropped (this usually occurs with multicast packets). if the local loopback (lle) bit in dtxswc is set for this pool, packets originating from a given pool can be sent to the same pool.
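The tx switching rules above can be sketched in Python as follows. This is a simplified illustration, not the hardware logic: the function and parameter names are hypothetical, `lle` is assumed to be a per-pool bitmap taken from dtxswc, and `lvlan` is assumed to mean that the packet's vlan matched a vlvf entry with the lvlan bit set.

```python
def filter_source_pool(pool_list, src_pool, lle):
    """Remove the sending pool from the pool list unless its lle bit is set."""
    if lle & (1 << src_pool):
        return pool_list
    return pool_list & ~(1 << src_pool)


def tx_forward(pool_list, src_pool, lle, *, is_unicast, exact_match, lvlan):
    """Return (local pool list, sent_to_lan) for one transmitted packet."""
    pools = filter_source_pool(pool_list, src_pool, lle)
    # broadcast/multicast always go out; unicast goes out unless it hit an exact filter
    to_lan = (not is_unicast) or (not exact_match)
    # a local vlan overrides the previous rules
    if lvlan:
        to_lan = False
    return pools, to_lan
```

With `lle == 0`, a multicast packet sent by pool 1 with pool list `0b0110` is delivered locally to pool 2 only and is also sent to the lan.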
the following rules apply to loopback traffic:
- loopback is disabled when the network link is disconnected. it is expected (but not required) that system software (including virtual machines) does not post packets for transmission when the link is disconnected.
- loopback is disabled when the receive enable (rxen) bit is cleared.
- loopback packets are identified by the lb bit in the receive descriptor.
figure 7-30. tx filtering
7.10.3.6.1 replication mode enabled
when replication mode is enabled, the pool list for tx packets is determined according to the following steps:
1. exact unicast or multicast match - if there is a match in one of the exact filters (ral/rah), for unicast or multicast packets, take the rah.plsel field as a candidate for the pool list.
2. broadcast - if the packet is a broadcast packet, add the pools whose vmolr.bam bit (broadcast accept mode) is set.
3. unicast hash - if the packet is a unicast packet, and the prior steps yielded no pools, check it against the unicast hash table (uta). if there is a match, add the pools whose vmolr.rope bit (receive overflow packet enable) is set.
4. multicast hash - if the packet is a multicast packet and the prior steps yielded no pools, check it against the multicast hash table (mta). if there is a match, add the pools whose vmolr.rompe bit (receive multicast packet enable) is set.
5. multicast promiscuous - if the packet is a multicast packet, take the candidate list from prior steps and add the pools whose vmolr.mpe bit (multicast promiscuous enable) is set.
6. filter source port - the pool from which the packet was sent is removed from the pool list unless the dtxswc.lle bit is set.
7. vlan groups - this step is relevant only if the rctl.vfe bit is set; otherwise it is skipped. packets should be sent only to vms that belong to the packet's vlan group.
a. tagged packets: enable only pools in the packet's vlan group as defined by the vlan filters (vlvf[n].vlan_id) and their pool list (vlvf[n].poolsel[7:0]).
b. untagged packets: enable only pools with their vmolr.aupe bit set.
c. if there is no match, the pool list should be empty.
8. forwarding to the network:
a. all broadcast and multicast packets are also sent to the network.
b. unicast packets that do not match any exact filter are also sent to the network.
note: for packets forwarded to the network, if the vlvf.lvlan bit is set, then the packet is not sent to the network.
9. vfre - if any bit in the vfre register is cleared, clear the respective bit in the pool list.
10. length limit - if the packet is longer than a legal ethernet packet, remove from the pool list all the pools for which the vmolr.lpe bit is not set or for which the packet length is larger than the value in the vmolr.rlpml field.
11. mirroring - each of the 4 mirroring rules adds its destination pool (vmrctl.mp) to the pool list if the following applies:
a. pool mirroring - vmrctl.vpme is set and one of the bits in the pool list matches one of the bits in the vmrvm register.
b. vlan port mirroring - vmrctl.vlme is set and the index of the vlan of the packet in the vlvf table matches one of the bits in the vmrvlan register.
c. downlink port mirroring - vmrctl.dpme is set and the packet is sent to the network.
d. vfre - if any bit in the vfre register is cleared, clear the respective bit in the pool list.
12. length limit - if the packet is longer than a legal ethernet packet, remove from the pool list all the pools for which the vmolr.lpe bit is not set or for which the packet length is larger than the value in the vmolr.rlpml field.
7.10.3.6.2 replication mode disabled
when replication mode is disabled, the software should take care of multicast and broadcast packets and check which of the vms should get them. in this mode the pool list for tx packets always contains at most one pool, determined according to the following steps:
1. exact unicast or multicast match - if the packet da matches one of the exact filters (ral/rah), take the rah.plsel field as a candidate for the pool list.
2. unicast hash - if the packet is a unicast packet, and the prior steps yielded no vms, check it against the unicast hash table (uta). if there is a match, add the pools whose vmolr.rope bit (receive overflow packet enable) is set.
3. filter source port - the pool from which the packet was sent is removed from the pool list unless the dtxswc.lle bit is set.
4. forwarding to the network - all broadcast and multicast packets are also sent to the network. unicast packets added to the pool list at step 2 (unicast hash), or for which the pool list is empty, are also forwarded to the network.
5. ignore mac (vlan only filtering) - if the vt_ctl.igmac bit is set, then the previous steps are ignored and a full pool list is assumed for the next step.
6. vlan groups - this step is relevant only if the rctl.vfe bit is set; otherwise it is skipped. packets should be sent only to vms that belong to the packet's vlan group.
a. tagged packets: enable only pools in the packet's vlan group as defined by the vlan filters (vlvf[n].vlan_id) and their pool list (vlvf[n].poolsel[7:0]).
b. untagged packets: enable only pools with their vmolr.aupe bit set.
c. if there is no match, the pool list should be empty.
7. forwarding to the network:
a. all broadcast and multicast packets are also sent to the network.
b. unicast packets that do not match any exact filter are also sent to the network.
note: for packets forwarded to the network, if the vlvf.lvlan bit is set, then the packet is not sent to the network.
8. length limit - if the packet is longer than a legal ethernet packet, remove from the pool list all the pools for which the vmolr.lpe bit is not set or for which the packet length is larger than the value in the vmolr.rlpml field.
9. vfre - if any bit in the vfre register is cleared, clear the respective bit in the pool list.
the limitations listed in section 7.10.3.5.2 apply to tx traffic also.
7.10.3.7 mirroring support
the 82576 supports 4 mirroring rules. each rule can be of one of 5 types. only egress mirroring is supported and not ingress. that is, the mirroring is done on the receive path and mirrored packets reflect all the changes that occur to the received packet. mirroring is supported only to virtual ports and not to the uplink. mirroring should be activated only when one of the next generation vmdq queueing modes is used. the following types of rules are supported:
1. virtual port mirroring - reflects all the packets sent to a set of given vms.
2. uplink port mirroring - reflects all the traffic received from the network.
3. downlink port mirroring - reflects all the traffic transmitted to the network.
4. receive mirroring - reflects all the traffic received by any of the vms, either from the network or from local vms. this is supported by enabling mirroring of all vms.
5. vlan mirroring - reflects all the traffic received in a set of given vlans, either from the network or from local vms.
all the modes can be accumulated into a single rule. this new mirroring mode is controlled by a set of rule control registers:
- vmrctl - controls the rules to be applied and the destination port.
- vmrvlan - controls the vlan ports, as listed in the vlvf table, taking part in the vlan mirror rule.
- vmrvm - controls the vm ports taking part in the virtual port mirror rule.
mirroring is supported only when replication is enabled. the exact flow of mirroring is described in step 12 in section 7.10.3.5.1.
7.10.3.8 offloads
in case of packets directed to one vm only, the offloads are determined by this specific vm's settings. however, the 82576 cannot apply different offloads (vlan & crc strip + decision of size of header for split/replication offload) to different replications of the same packet. the following sections describe the rules used to decide which offloads to apply in case of replicated packets. if replication is disabled, the offloads are determined by the unique destination of the packet.
note: in a virtualization environment (mrqc.multiple receive queues enable = 011b - 101b), the global vlan strip and crc strip bits (ctrl.vme & rctl.secrc) are ignored and the vf specific bits in vmolr & rplolr are used instead. vlan strip offload is determined based only on the l2 mac address. in order to make sure vlan strip offload is correctly applied, all packets should be initially forwarded using one of the l2 mac address filters (rah/ral, uta, mta, vmolr.bam, vmolr.mpe).
7.10.3.8.1 replication by exact mac address
as mentioned above, the same mac address can be assigned to more than one vm. this is used for the following cases:
- multicast address - in this case, the different vms might be part of the same vlan. the offloads applied to packets matching this address are defined in the replicated packets offloads register.
- unicast; same mac different vlan - in this case, each vm should belong to different vlan(s). the offloads are applied according to the pool selected by the mac/vlan pair. one exception is the vlan strip decision, which is made according to the first pool's parameters. this means that all the pools sharing a mac address should use a common vlan strip policy.
7.10.3.8.2 replication by promiscuous modes
a packet might be replicated to multiple vms because some of the vms are set to receive all multicast or broadcast packets or because the packet matches one of the hash tables (uta or mta). the offloads applied to the packet are defined in the replicated packets offloads registers. in case of a unicast packet, the offloads are applied according to the first of the pools selected to receive the packet.
7.10.3.8.3 replication by mirroring
offloads of mirrored packets are determined according to the original pool.
7.10.3.8.4 vlan only filtering
if the vt_ctl.igmac bit is set, the pool is defined according to the vlan only. in this mode, only a uniform vlan strip policy is supported. this means the vmolr.strvlan bit should be set to the same value for all vfs.
7.10.3.8.5 local traffic offload
most of the offloads available for regular incoming traffic are not available in case of vm to vm traffic. the driver might handle the lack of the offloads as follows:
1. vlan strip - a loopback tagged packet is always received by the destination vf with the vlan tag stripped. the vp & vpkt bits in the receive descriptor indicate this condition. the vlan tag is received in the descriptor.
2. checksum - the transmit path always adds a checksum, either by the driver or by the 82576, but this checksum is not validated by the receive path. as this packet wasn't sent over the network, the receive side might assume the tcp and ip checksums are valid.
3. packet types identification - the l3 packet type identification is provided only if at least one of the following offloads is requested for the transmitted packet: ip checksum, l4 checksum or ipsec offload. the l4 packet type identification is provided only if l4 checksum is requested for the transmitted packet. a packet might be identified as ipv4 with extensions only if ip checksum was requested on this packet. l5 packet type identification is not valid for loopback packets.
4. header split & replication - available only for part of the local packets. it is available only if the header split boundary is at the l4 level (tcp/udp), in cases where the tx side provided a valid l4 packet type (in packets for which l4 checksum is requested). in all other cases the sph is set to zero.
5. error bits - the error bits are also fixed to zero, although most of the errors are not relevant for loopback packets.
6. rss - when using rss for in-pool queueing, local packets are sent to queue zero of the pool and the rss hash is not provided.
7. special queueing filters - filters such as the 5-tuple filter or ether-type filter are not applied to the local traffic. a driver using such filters should check if a packet belongs to a special queue and redirect it accordingly.
7.10.3.8.6 small packets padding
in virtualized systems, the driver receiving the packet in the vm might not be aware of all the hardware offloads applied to the packet. thus, in case of stripping actions by the hardware (vlan strip), it might receive packets which are smaller than a legal packet. the 82576 provides an option to pad small packets in such cases so that all packets have a legal size. this option can be enabled only if the crc is stripped. in these cases, all packets are padded to 60 bytes (legal packet - 4 bytes crc). the padding is done with zero data. this function is enabled via the rctl.psp bit.
7.10.3.9 security features
the 82576 allows some security checks on the inbound and outbound traffic of the switch.
7.10.3.9.1 inbound security
each incoming packet (either from the lan or from a local vm) is filtered according to the vlan tag so that packets from one vlan cannot be received by vms that are not members of that vlan.
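A minimal Python sketch of this vlan membership check is shown below, assuming the vlvf table is modeled as a list of `(vlan_id, poolsel)` pairs and the per-pool vmolr.aupe bits are collected into one bitmap. The function name and data layout are assumptions for illustration only.

```python
def vlan_group_filter(pool_list, vlan_id, vlvf, aupe_mask):
    """Keep only the pools allowed to receive a packet with this vlan tag.

    vlan_id is None for an untagged packet.
    vlvf is a list of (vlan_id, poolsel) pairs modeling the vlvf entries.
    aupe_mask has bit i set when pool i accepts untagged packets.
    """
    if vlan_id is None:
        return pool_list & aupe_mask
    for entry_vlan, poolsel in vlvf:
        if entry_vlan == vlan_id:
            return pool_list & poolsel
    # no matching vlan filter: the pool list is empty
    return 0
```

For instance, with `vlvf = [(100, 0b0011), (200, 0b1100)]`, a packet tagged with vlan 100 and a candidate pool list of `0b0110` reaches only pool 1, since pool 2 is not a member of vlan 100.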
in a cts enabled system, all packets are filtered according to their cts sgt tag to decide if a packet should be received by a given vf. tbd - cts support
7.10.3.9.2 outbound security
7.10.3.9.2.1 anti spoofing
the source mac address of each outgoing packet can be compared to the mac address the sending vm uses for packet reception. a packet with a non-matching sa is dropped, thus preventing spoofing of the mac address. this feature is enabled in the dtxswc register, and can be enabled per vf. if vlan anti spoofing is set, a check is done to validate that the sender is a member of the vlan set in the packet. if it is not, then the packet is dropped and a notification is sent to the vmm via the icr.mddet bit. this mode is controlled via the dtxswc.vlanas field.
note: anti spoofing is not available for vms that hide other vms behind them whose mac addresses are not part of the rah/ral mac address registers. in this case anti-spoofing should be done by the software switch handling these vms.
7.10.3.9.2.2 vlan insertion from register instead of descriptor
there are cases where the vlan should be inserted by the switch without intervention from the guest operating system. in next generation vmdq mode, where the physical driver is controlled by a trusted central entity, we can assume the software requests inserting the right tag. however, in iov scenarios, the driver might be malicious, and thus we cannot assume it uses the right vlan tags. in order to overcome this issue, default vlan tags are defined per vm, and a default behavior is defined. the possible behaviors are:
1. use descriptor value - to be used in case of a trusted vm that can decide which vlan to send. this option should be used also in case one vm is a member of multiple vlans.
2. always insert default vlan - this mode should be used for non-trusted or non-vlan-aware vms. in this case any vlan insertion command from the vm is ignored. if a packet is received with a vlan, the packet should be dropped.
3. never insert vlan - this mode should be used in a non-vlan network. in this case any vlan insertion command from the vm is ignored. if a packet is received with a vlan, the packet should be dropped.
note: the vlan insertion settings should be done before any of the queues of the vm are enabled.
7.10.3.9.2.3 egress vlan filtering
part of the vlans used by vmm vendors are vlans local to the virtualized server. packets sent with a private vlan should not be forwarded to the external network. local vlans are indicated by setting the lvlan bit in the corresponding vlvf entry.
note: a packet with a local vlan tag whose destination is not in the server is dropped. this means that a local vlan should be confined to one physical port and cannot have member vms connected to different ports, even in the same nic.
7.10.3.9.3 interrupt misbehavior of vm
the hardware can be programmed to take some action as a result of some misbehavior of a vm. for example, upon detection of a packet with a wrong source mac address, the hardware might block the packet. these actions might hint that some vm is malicious and that the vmm should remedy the situation. in order to inform the vmm of this fact, an interrupt is added to the icr register (wrong vm behavior bit) to indicate the occurrence of such an action. in addition, the lvmmc register contains a bitmap of all the vms against whom some action was taken. this register is cleared on read. the lvmiac register indicates the last misbehavior detected.
7.10.3.10 congestion control
7.10.3.10.1 receive priority
as the switch might decide to loop back packets from the transmit path to the receive path, in case the receive path is full, the transmit path might be blocked (including the traffic to the lan). the 82576 guarantees that packets are not dropped. the pf driver might decide to program the 82576 to drop packets from receive queues without available descriptors. in order to keep the congestion effect local, receive traffic from the lan has higher priority than loopback traffic. this way, large loopback traffic does not impact the network.
7.10.3.10.2 queue arbitration and rate control
in order to guarantee enough bandwidth to each vm, a per vm bandwidth control mechanism is added to the 82576. each vm gets an allocation of transmit bandwidth and is guaranteed it can transmit within the allocation. received packets can be either packets received from the network or loopback packets. packets received from the network are handled before loopback packets, no matter if the packets are unicast or replicated packets.
7.10.3.10.3 storm control
as there is no separate path for multicast & broadcast packets, too many replicated packets might cause congestion in the data path. in order to avoid such scenarios, broadcast and multicast storm control rate limiters are added. the rate controllers define windows and the maximal allowed number of multicast or broadcast bytes/packets per window. once the threshold is crossed, different types of policies can be applied.
7.10.3.10.3.1 assumptions
- only one interval size and interval counter is used for both broadcast & multicast storm control mechanisms.
- the threshold and actions for each mechanism are separate.
- the traffic used to calculate the broadcast & multicast rate is all the traffic with a local destination, either tx or rx.
- the storm control does not block traffic to the network.
- the basic unit of traffic counted is 64 bytes of data.
7.10.3.10.3.2 storm control functionality
the time interval over which broadcast storm control is performed is controlled by three factors:
- the scbi register.
- the port speed.
- the value in sccrl.interval.
the first two factors determine the unit time interval as described in table 7-72. the interval is automatically chosen internally by the hardware based on port speed. the third factor (interval field) determines how many such unit intervals are considered for one storm control interval. the number of 64-byte chunks of broadcast or multicast packets that are allowed in a given interval is determined by setting the bsctrh or msctrh register respectively. the 82576 supports two modes of reaction to a storm event:
1. block all packets from the moment the threshold is crossed until the end of the interval. the block is removed at the end of the interval until the threshold is crossed again. this mode is set by asserting sccrl.mdicw (for multicast) or sccrl.bdicw (for broadcasts). this mode is used as a rate limiter.
2. block all packets from the moment the threshold is crossed until a full interval without threshold crossing is registered. the block is removed at the end of the interval until the threshold is crossed again. this mode is set by asserting sccrl.mdicw and sccrl.mdipw (for multicast) or sccrl.bdicw and sccrl.bdipw (for broadcasts). this mode is used for storm blocking.
the 82576 might consider all packets for which a queue was not found as broadcast packets, for example, packets that passed the 1st stage of l2 filtering but didn't pass the 2nd stage of pooling, or were sent to the default pool. this mode is activated by setting the sccrl.bidu field. any change in the storm control state (block or pass of multicast or broadcast packets) is indicated to the software via the icr.sce interrupt cause.
the current state is reflected in the scsts register. for diagnostic purposes only, the storm control timer and counters can be read via the sctc, msccnt & bsccnt registers.
7.10.3.11 external switch loopback support
one of the long term solutions for the switching issue is a mode where an external switch would do the loopback of vf to vf traffic and the nic is responsible for the replication of multicast packets only. in order to support this mode, the internal loopback mode should be disabled and the sa of received packets should be compared to the exact mac addresses to check if the packet originated from a local source, so that the packet is not forwarded to the originator. this mode is enabled by the vt_ctl.flp bit.
table 7-72. storm control interval by speed
port speed | min time interval | max time interval
1 gb/s | 100 µs | 100 ms
100 mb/s | 1 ms | 1 s
10 mb/s | 10 ms | 10 s
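The two storm control reactions of section 7.10.3.10.3.2 (rate limiter and storm blocking) can be compared with a small Python simulation. This is a sketch under stated assumptions: traffic is given as the number of 64-byte chunks offered in each storm control interval, the threshold counts as crossed when the offered count exceeds it, and traffic keeps being counted while it is blocked. The function name and these modeling choices are not from the datasheet.

```python
def storm_control(offered, threshold, storm_blocking=False):
    """Return the number of 64-byte chunks passed in each interval.

    offered: chunks offered per storm control interval.
    threshold: allowed chunks per interval (models bsctrh/msctrh).
    storm_blocking: False models mode 1 (rate limiter),
                    True models mode 2 (storm blocking).
    """
    passed = []
    blocked = False  # block carried into the next interval (mode 2 only)
    for chunks in offered:
        crossed = chunks > threshold
        if blocked:
            passed.append(0)
            # the block is lifted only after a full interval without crossing
            blocked = crossed
        else:
            # traffic passes until the threshold is crossed inside the interval
            passed.append(min(chunks, threshold))
            blocked = crossed and storm_blocking
    return passed
```

With `offered = [10, 50, 50, 10, 10]` and `threshold = 20`, the rate limiter passes `[10, 20, 20, 10, 10]`, while storm blocking passes `[10, 20, 0, 0, 10]`: the burst keeps the block in place until one complete quiet interval has elapsed.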
7.10.3.12 switch control
the pf driver has some control of the switch logic. the following registers are available to the pf for this purpose:
- vlvf: vlan queuing table - a set of 32 vlan entries with an associated per vf bit map allowing allocation of each vf to each of the 32 vlan tags.
- sgtta: sgt table - a set of tbd sgt entries with an associated per vf bit map allowing allocation of each vf to each of the tbd sgt tags.
- dtxswc: dma tx switch control register - controls the security settings of the switch such as mac & vlan anti spoof filters, local loopback enable and the loopback enable mode.
- qde: queue drop enable register(s) - a register defining whether receive packets destined to a specific queue are dropped if no descriptors are available. this register overrides the individual srrctl.drop_en bits.
- vt_ctl: vt control register - contains the following fields:
  - replication enable - allows replication of multicast & broadcast packets, both in incoming & outgoing traffic. if this bit is cleared, tx multicast & broadcast packets are sent only to the network and rx multicast & broadcast packets are sent to the default vm.
  - default pool - defines where to send packets that passed l2 filtering but didn't pass any of the queueing mechanisms.
  - default pool disable - defines whether to drop packets that passed l2 filtering but didn't pass any of the queueing mechanisms.
- vmvir: a set of registers used to control vlan insertion of outgoing packets.
- vmolr/rplolr: defines the offloads and pool selection options for each vf and for replicated packets.
in addition, the storm control mechanism is programmed as described in section 7.10.3.10.3.2, security features are described in section 7.10.3.9, and the rate control mechanism is programmed.
7.10.4 virtualization of the hardware
this section describes additional features used in both iov & next generation vmdq modes.
7.10.4.1 per pool statistics
part of the statistics are by definition shared and cannot be allocated to a specific vm. for example, crc error count cannot be allocated to a specific vm, as the destination of such a packet is not known if the crc is wrong. all the non-specific statistics are handled by the pf driver in the same way as in non-virtualized systems. a vm might request a statistic from the pf driver but might not access it directly. the conceptual model used to gather statistics in a virtualization context is that each queue pool is considered as a virtual link and the ethernet link is considered as the uplink of the switch. thus any packet sent by a vm is counted in the tx statistics, even if it was forwarded to another vm internally or was dropped by the mac for some reason. in the same way, a replicated packet is counted in each of the vms receiving it.
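The virtual link counting model can be illustrated with a short Python sketch. The counter names (`tx_packets`, `rx_octets` and so on) and the `account_tx` helper are hypothetical labels chosen for this example; they are not register names from the datasheet.

```python
from collections import Counter, defaultdict


def account_tx(stats, src_vm, local_dsts, octets):
    """Count one packet sent by src_vm and delivered to local_dsts."""
    # the sender's virtual link always counts the transmission,
    # even if the packet never leaves the device
    stats[src_vm]["tx_packets"] += 1
    stats[src_vm]["tx_octets"] += octets
    # a replicated packet is counted once per receiving vm
    for vm in local_dsts:
        stats[vm]["rx_packets"] += 1
        stats[vm]["rx_octets"] += octets


stats = defaultdict(Counter)
account_tx(stats, src_vm=0, local_dsts=[1, 2], octets=128)
```

After this call, vm 0 shows one transmitted packet of 128 octets, and vms 1 and 2 each show one received packet of 128 octets.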
the following statistics are provided per vm:
1. good packet received count.
2. good packet transmitted count.
3. good octets received count.
4. good octets transmitted count.
5. rx packet dropped because of queue descriptors not available (per queue).
6. multicast packets received count.
7. good packet received from local vm count.
8. good packet transmitted to local vm count.
9. good octets received from local vm count.
10. good octets transmitted to local vm count.
note: all the per vf statistics are ro and wrap around after reaching their maximal value.
7.11 time sync (ieee1588 and 802.1as)
7.11.1 overview
measurement and control applications are increasingly using distributed system technologies such as network communication, local computing, and distributed objects. many of these applications are enhanced by having an accurate system-wide sense of time achieved by having local clocks in each sensor, actuator, or other system device. without a standardized protocol for synchronizing these clocks, it is unlikely that the benefits are realized in the multi-vendor system component market. existing protocols for clock synchronization are not optimum for these applications. for example, network time protocol (ntp) targets large distributed computing systems with millisecond synchronization requirements. the 1588 standard specifically addresses the needs of measurement and control systems:
- spatially localized
- microsecond to sub-microsecond accuracy
- administration free
- accessible for both high-end devices and low-cost, low-end devices
the time sync mechanism can be activated in full duplex mode only. there are no limitations on the wire speed, although the wire speed might affect the accuracy.
7.11.2 flow and hardware/software responsibilities
the operation of a ptp (precision time protocol) enabled network is divided into two stages, initialization and time synchronization. at the initialization stage every master-enabled node starts by sending sync packets that include the clock parameters of its clock. upon reception of a sync packet, a node compares the received clock parameters to its own. if the received clock parameters are better than those available on this node, the node moves to slave state and stops sending sync packets. when in slave state, the node continuously compares the incoming packet clock parameters to those of its currently chosen master. if the new clock parameters are better than the current master selection, it changes master clock source. eventually the best master clock source is chosen. every node has a defined sync packet time-out interval. if no sync packet is received from its chosen master clock source during the interval, it moves back to master state and starts sending sync packets until a new best master clock (bmc) is chosen.
the time synchronization stage is different for master and slave nodes. if a node is in master state, it should periodically send a sync packet which is time stamped by hardware on the transmit path (as close as possible to the phy). after the sync packet, a follow_up packet is sent which includes the value of the timestamp kept from the sync packet. in addition, the master should timestamp delay_req packets on its rx path and return the timestamp value to the slave that sent it using a delay_response packet. a node in slave state should timestamp every incoming sync packet that is received from its selected master; software uses this value for time offset calculation. in addition, it should periodically send delay_req packets in order to calculate the path delay from its master. every delay_req packet sent by the slave is time stamped and kept. using the value received from the master in the delay_response packet, the slave can now calculate the path delay from the master to the slave. the synchronization protocol flow and the offset calculation are described in figure 7-31.
the hardware's responsibilities are:
1. identify the packets that require time stamping.
2. time stamp the packets on both receive and transmit paths.
3. store the time stamp value for software.
4. keep the system time in hardware and give a time adjustment service to the software.
5. maintain auxiliary features related to the system time.
the software's responsibilities are:
1. bmc protocol execution, which means defining the node state (master or slave) and selection of the master clock if in slave state.
2. generate ptp packets, consume ptp packets.
3. calculate the time offset and adjust the system time using the hardware mechanism.
4. enable configuration and usage of the auxiliary features.
figure 7-31. sync flow and offset calculation
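The offset calculation performed by the slave software can be sketched in Python using the standard ieee 1588 delay request-response arithmetic. Here t1 is the master's transmit timestamp of the sync packet, t2 the slave's receive timestamp of that packet, t3 the slave's transmit timestamp of the delay_req packet, and t4 the master's receive timestamp of it. The function name is hypothetical, and the sketch assumes a symmetric path delay, which is the usual assumption of this method rather than a statement from the datasheet.

```python
def ptp_offset_and_delay(t1, t2, t3, t4):
    """Return (slave clock offset from master, one-way path delay).

    All timestamps must use the same time unit.
    """
    master_to_slave = t2 - t1   # path delay + offset
    slave_to_master = t4 - t3   # path delay - offset
    delay = (master_to_slave + slave_to_master) / 2
    offset = (master_to_slave - slave_to_master) / 2
    return offset, delay


# slave clock runs 500 units ahead of the master, path delay is 100 units
offset, delay = ptp_offset_and_delay(t1=1000, t2=1600, t3=2000, t4=1600)
```

The slave would then subtract `offset` from its system time using the hardware adjustment mechanism.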
table 7-73. chronological order of events for sync and path delay

action | responsibility | node role
generate a sync packet with timestamp notification in descriptor | software | master
timestamp the packet and store the value in registers (t1) | hardware | master
timestamp incoming sync packet, store the value in register and store the sourceid and sequenceid in registers (t2) | hardware | slave
read the timestamp from register, prepare a follow_up packet and send | software | master
once follow_up packet is received, load t2 from registers and t1 from follow_up packet | software | slave
generate a delay_req packet with timestamp notification in descriptor | software | slave
timestamp the packet and store the value in registers (t3) | hardware | slave
timestamp incoming delay_req packet, store the value in register and store the sourceid and sequenceid in registers (t4) | hardware | master
read the timestamp from register and send back to slave using a delay_response packet | software | master
once delay_response packet is received, calculate offset using t1, t2, t3 and t4 values | software | slave

7.11.2.1 timesync indications in receive and transmit packet descriptors

some indications need to be transferred between software and hardware regarding ptp packets. on the receive path, the hardware transfers two indications to software in the receive descriptor:

1. an indication in rdesc.packet type that this packet is a ptp packet (whether or not a time stamp was sampled). this indication is also used by ptp packets required for protocol management.

note: this indication is only relevant for l2 type packets (the ptp packet is identified according to its ethertype). ptp packets have the l2type bit in the packet type field set (bit 11) and the etype matches the filter number set by the software to filter ptp packets. udp type ptp packets don't require such an indication since the port number (319 for event and 320 for all other ptp packets) directs the packets toward the time sync application.

2. a second indication in the rdesc.status.ts bit that tells the software that a time stamp was taken for this packet. software needs to access the time stamp registers to get the time stamp values.

7.11.3 hardware time sync elements

all time sync hardware elements are reset to their initial values, as defined in the registers section, upon mac reset.

7.11.3.1 system time structure and mode of operation

the time sync logic contains an up counter to maintain the system time value. this is a 64-bit counter built of the systiml and systimh registers. when in master state, the systimh and systiml registers should be set once by the software according to the general system requirements. when in slave state, software should update the system time on every sync event as described in section 7.11.3.3. setting the system time is done by a direct write to the systimh register and fine-tune setting of the systiml register using the adjustment mechanism described in section 7.11.3.3.

read access to the systimh and systiml registers should be executed in the following order:

1. software reads register systiml (at this stage the hardware latches the value of systimh).
2. software reads register systimh (the hardware returns the value latched at the last read from systiml).

upon an increment event, the system time value is incremented by the value stored in timinca.incvalue. an increment event happens every timinca.incperiod cycles. if the timinca.incperiod value is one, the increment event occurs every clock cycle. the incvalue defines the time granularity represented by the systimh/l registers. for example, if the cycle time is 16 ns and the incperiod is one, then an incvalue of 16 means the systimh/l time is represented in nanoseconds. if the incvalue is 160, the time is represented in 0.1 ns units, and so on. the incperiod helps to avoid inaccuracy in cases where the cycle time cannot be represented as a simple integer and should be multiplied to get to an integer representation. the incperiod value should be as small as possible to achieve the best possible accuracy.

note: best accuracy is achieved at the lowest permitted incperiod (equal to 1) and as high as possible an incvalue.

7.11.3.2 time stamping mechanism

the time stamp logic is located on the transmit and receive paths at a location as close as possible to the phy, to reduce delay uncertainties originating from implementation differences. the time stamp logic operation is slightly different on the transmit and receive paths. the transmit logic decides to timestamp a packet if the transmit timestamp is enabled (tsynctxctl.en = 1) and the time stamp bit in the packet descriptor (tdesd.mac.1588 = 1) is set.
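the read order and the incvalue/incperiod arithmetic of section 7.11.3.1 can be illustrated with a short c sketch. this is a minimal sketch under stated assumptions, not driver code: the accessor type reg_read_fn, the register ids reg_systiml/reg_systimh, the fake_systim test double and the helper names are all hypothetical and do not correspond to real register offsets or to any intel api.

```c
#include <assert.h>
#include <stdint.h>

/* hypothetical register accessor: returns the 32-bit value of register `reg` */
typedef uint32_t (*reg_read_fn)(void *ctx, int reg);

/* illustrative register ids only, not real csr offsets */
enum { REG_SYSTIML, REG_SYSTIMH };

/* read systiml first (hardware latches systimh), then read the latched systimh */
uint64_t systim_read(reg_read_fn rd, void *ctx)
{
    uint32_t lo = rd(ctx, REG_SYSTIML);
    uint32_t hi = rd(ctx, REG_SYSTIMH);
    return ((uint64_t)hi << 32) | lo;
}

/* convert a systim count to nanoseconds: the counter gains `incvalue`
 * counts every `incperiod` clock cycles of `cycle_ns` nanoseconds each */
uint64_t systim_counts_to_ns(uint64_t counts, uint32_t incvalue,
                             uint32_t incperiod, uint32_t cycle_ns)
{
    assert(incvalue != 0);
    uint64_t ns_per_event = (uint64_t)incperiod * cycle_ns;
    /* split into quotient and remainder to limit overflow on large counts */
    return (counts / incvalue) * ns_per_event
         + (counts % incvalue) * ns_per_event / incvalue;
}

/* test double: models the latch by freezing systimh on a systiml read,
 * then bumping the live value as if a rollover happened between reads */
struct fake_systim { uint32_t systiml, systimh, latched_h; };

uint32_t fake_read(void *ctx, int reg)
{
    struct fake_systim *f = ctx;
    if (reg == REG_SYSTIML) {
        f->latched_h = f->systimh;
        f->systimh++;
        return f->systiml;
    }
    return f->latched_h;
}
```

with a 16 ns cycle and incperiod of one, an incvalue of 16 makes one count equal to one nanosecond, while an incvalue of 160 makes one count equal to 0.1 ns, which matches the example in the text.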
on the transmit side only the time is captured, in the txstmpl and txstmph registers. the receive logic parses the received frame and, if it matches the ctrlt or msgt message types defined in the tsyncrxcfg register (see section 8.17.23), the time, sourceid and sequenceid are latched in the rxstmpl, rxstmph, rxsatrl and rxsatrh timestamp registers. in addition, two indications are placed in the receive descriptor:

1. rdesc.packet type - the value in this field identifies that this is a ptp packet (this indication is only for l2 packets, since on udp packets the port number directs the packet to the application).
2. rdesc.status.ts - this bit identifies that a time stamp was taken for this packet. if this ptp packet is not a sync or delay_req packet, or if for some reason the time stamp was not taken, the ts bit is not set.

for more details, refer to the timestamp registers in section 8.16. figure 7-32 defines the exact point where the time value is captured.

on both transmit and receive sides the timestamp values are locked in registers until the values are read by software. as a result, if a new ptp packet that requires a time stamp arrives before software accesses the registers, it is not time stamped. in some cases in the receive path, a packet that was timestamped might be lost and not reach the host. to avoid a deadlock condition on the time stamp registers, the software should keep a watchdog timer to clear locking of the time stamp register. the interval counted by such a timer should be longer than the expected interval between two sync or delay_req packets, depending on the node state (master or slave).

7.11.3.3 time adjustment mode of operation

a node in a time sync network can be in one of two states: master or slave. when a time sync entity is in the master state, it should synchronize other entities to its system clock. in this case no time adjustments are needed. when the entity is in slave state, it should adjust its system clock by using the data arriving in the follow_up and delay_response packets and the time stamp values of the sync and delay_req packets. when all the values are available, the software in the slave entity can calculate its offset in the following manner:

toffset = [(t2 - t1) - (t4 - t3)] / 2

t1 - timing data in follow_up packet
t2 - sync time stamp
t3 - delay_req time stamp
t4 - timing data in delay_response packet

after the offset calculation, the system time register should be updated. this is done by writing the calculated offset to the timadjl and timadjh registers. the order should be as follows:

1. write the low portion of the offset to timadjl.
2. write the high portion of the offset to the lower 31 bits of timadjh and the sign to the most significant bit.

after the write cycle to timadjh, the value of timadjh and timadjl is added to the system time.

7.11.4 time sync related auxiliary elements

the time sync logic implements two types of auxiliary elements using the precise system timer.

7.11.4.1 target time

the two target time registers, trgttiml/h0 and trgttiml/h1, make it possible to generate a time-triggered event to external hardware using one of the sdp pins. each target time register is structured the same as the system time register.
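returning to the offset calculation of section 7.11.3.3, the following c sketch shows the arithmetic and one possible way to split the result for timadjl and timadjh. it is a minimal sketch under stated assumptions: it uses the standard ieee 1588 form of the offset with the (t4 - t3) term, it reads the "lower 31 bits plus sign bit" description as a sign-magnitude encoding, and the names ptp_offset, struct timadj and timadj_encode are hypothetical. whether the offset or its negation must be written depends on the sign convention in use and is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* offset of the slave clock relative to the master, all values in the
 * same time unit; t1/t4 come from the master, t2/t3 from the slave */
int64_t ptp_offset(int64_t t1, int64_t t2, int64_t t3, int64_t t4)
{
    return ((t2 - t1) - (t4 - t3)) / 2;
}

/* values to write: lo goes to timadjl first, hi goes to timadjh second
 * (the write to timadjh is what triggers the adjustment) */
struct timadj {
    uint32_t lo;
    uint32_t hi;
};

/* assumed sign-magnitude layout: magnitude bits 31:0 in timadjl,
 * magnitude bits 62:32 in timadjh[30:0], sign in timadjh[31] */
struct timadj timadj_encode(int64_t adj)
{
    uint64_t mag = adj < 0 ? (uint64_t)0 - (uint64_t)adj : (uint64_t)adj;
    struct timadj r;

    assert(mag < (1ULL << 63));  /* magnitude must fit in 63 bits */
    r.lo = (uint32_t)mag;
    r.hi = ((uint32_t)(mag >> 32) & 0x7fffffffu)
         | (adj < 0 ? 0x80000000u : 0u);
    return r;
}
```

for example, with t1 = 100, t2 = 150, t3 = 200 and t4 = 230 the two one-way measurements are 50 and 30, so the computed offset is 10 and the mean path delay is 40.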
if the value of the system time is equal to the value written to one of the target time registers, a change in level occurs on one of the sdp outputs. the accuracy of the comparison is defined by the value of tsauxc.mask, described in section 8.17.14. each target time register has an enable bit located in the auxiliary control register (tsauxc.en_tti). upon getting a target time event, the enable bit is cleared and needs to be set again by software to get another target time event.

figure 7-32. time stamp point

7.11.4.2 time stamp events

upon a change in the input level of one of the sdp pins that was configured to detect time stamp events using the tssdp register, a time stamp of the system time is captured into one of the two auxiliary time stamp registers (auxstmpl/h0 or auxstmpl/h1).

7.11.5 ptp packet structure

the time sync implementation supports both the 1588 v1 and v2 ptp frame formats. the v1 structure can come only as udp payload over ipv4, while the v2 structure can come over l2 with its ethertype or as a udp payload over ipv4 or ipv6. the 802.1as draft standard implementation uses only the layer 2 v2 format.

note: it is assumed that time sync v1 packets are never protected by ipsec.

table 7-74. v1 and v2 ptp message structure

offset in bytes | v1 fields (bits 7:0) | v2 fields (bits 7:4) | v2 fields (bits 3:0)
0 | versionptp | transport specific | messageid
1 | versionptp (continued) | reserved | versionptp
2-3 | version network | message length | message length
table 7-74. v1 and v2 ptp message structure (continued)

offset in bytes | v1 fields (bits 7:0) | v2 fields (bits 7:0)
4 | subdomain | subdomain number
5 | subdomain (continued) | reserved
6-7 | subdomain (continued) | flags
8-13 | subdomain (continued) | correctionns
14-15 | subdomain (continued) | correctionsubns
16-19 | subdomain (continued) | reserved
20 | message type | reserved
21 | source communication technology | source communication technology
22-27 | sourceuuid | sourceuuid
28-29 | source port id | source port id
30-31 | sequenceid | sequenceid
32 | control | control
table 7-74. v1 and v2 ptp message structure (continued)

offset in bytes | v1 fields (bits 7:0) | v2 fields (bits 7:0)
33 | reserved | log message period
34-35 | flags | n/a

note: only the fields with the bold italic format are of interest to the hardware.

table 7-75. ptp message over layer 2

ethernet (l2) | vlan (optional) | ptp ethertype | ptp message

table 7-76. ptp message over layer 4

ethernet (l2) | ip (l3) | udp | ptp message

table 7-77. message decoding for v1 (control field at offset 32)

enumeration | value
ptp_sync_message | 0
ptp_delay_req_message | 1
ptp_followup_message | 2
ptp_delay_resp_message | 3
ptp_management_message | 4
reserved | 5-255

table 7-78. message decoding for v2 (messageid field at offset 0)

messageid | message type | value (hex)
ptp_sync_message | event | 0
ptp_delay_req_message | event | 1
ptp_path_delay_req_message | event | 2
ptp_path_delay_resp_message | event | 3
unused | | 4-7
ptp_followup_message | general | 8

when a ptp packet is recognized (by ethertype or udp port address) on the receive side, then if the version is v1, the control field at offset 32 is compared to the tsyncrxcfg.ctrlt message field (see section 8.17.23); otherwise the byte at offset zero is compared to the tsyncrxcfg.msgt field. the rest of the required fields are at the same location and have the same size for both v1 and v2.
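the v1/v2 comparison rule described for the receive side can be sketched in c as follows. this is an illustration under stated assumptions rather than a description of the hardware logic: the function name ptp_rx_match is hypothetical, the version is taken from the low nibble of byte 1 (which holds versionptp in v2, and the low byte of the 2-byte versionptp field in v1 if that field is big endian), and for v2 only the low nibble of byte 0 (messageid) is compared, masking off the transport specific nibble.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* returns 1 if the ptp message would match the configured message type:
 * v1 compares the control field (offset 32) with ctrlt,
 * any other version compares messageid (offset 0, bits 3:0) with msgt */
int ptp_rx_match(const uint8_t *msg, size_t len, uint8_t ctrlt, uint8_t msgt)
{
    assert(msg != NULL);

    if (len < 33)               /* need bytes 0..32 to reach the control field */
        return 0;

    unsigned version = msg[1] & 0x0fu;   /* assumption: see lead-in */

    if (version == 1)
        return msg[32] == ctrlt;

    return (msg[0] & 0x0fu) == msgt;     /* assumption: nibble compare */
}
```

with ctrlt = 1 a v1 delay_req message (control value 1) matches, and with msgt = 0 a v2 sync message (messageid 0) matches regardless of its transport specific bits.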
table 7-78. message decoding for v2 (messageid field at offset 0) (continued)

messageid | message type | value (hex)
ptp_delay_resp_message | general | 9
ptp_path_delay_followup_message | general | a
ptp_announce_message | general | b
ptp_signalling_message | general | c
ptp_management_message | general | d
unused | | e-f

if v2 mode is configured in the tsyncrxctl.type field (see section 8.17.1), then the time stamp should be taken on ptp_path_delay_req_message and ptp_path_delay_resp_message according to the value in the tsyncrxcfg.msgt message field described in section 8.17.23.

7.12 statistics

the 82576 supports different statistics counters as described in section 8.19. the statistics counters can be used to create statistics reports as required by different standards. the 82576 statistics allow support for the following standards:

• ieee 802.3 clause 30 management - dte section.
• ndis 6.0 oid_gen_statistics.
• rfc 2819 - rmon ethernet statistics group.
• linux kernel (version 2.6) net_device_stats.
• ieee 802.1ae (macsec) secy management statistics.

the following sections describe the match between the internal 82576 statistics counters and the counters requested by the different standards.

7.12.1 ieee 802.3 clause 30 management

the 82576 supports the basic and mandatory packages defined in clause 30 of the ieee 802.3 spec. the following table describes the matching between the internal statistics and the counters requested by these packages.

table 7-79. ieee 802.3 mandatory package statistics

mandatory package capability | intel® 82576 gbe controller counter | notes and limitations
framestransmittedok | gptc | the 82576 doesn't include flow control packets.
singlecollisionframes | scc |
multiplecollisionframes | mcc |
table 7-79. ieee 802.3 mandatory package statistics (continued)

mandatory package capability | intel® 82576 gbe controller counter | notes and limitations
framesreceivedok | gprc | the 82576 doesn't include flow control packets.
framechecksequenceerrors | crcerrs |
alignmenterrors | algnerrc |

in addition, part of the recommended package is also implemented, as described in the following table.

table 7-80. ieee 802.3 recommended package statistics

recommended package capability | intel® 82576 gbe controller counter | notes and limitations
octetstransmittedok | gotch/gotcl | the 82576 also counts the da/sa/lt/crc as part of the octets. the 82576 doesn't count flow control packets.
frameswithdeferredxmissions | dc |
latecollisions | latecol |
framesabortedduetoxscolls | ecol |
frameslostduetointmacxmiterror | htdmpc | the 82576 also counts excessive collisions in this counter, whereas 802.3 increments this counter only when no other counter is incremented.
carriersenseerrors | tncrs | the 82576 doesn't count cases of crs de-assertion in the middle of the packet. however, such cases are not expected when the internal phy is used.
octetsreceivedok | torl+torh | also counts the da/sa/lt/crc as part of the octets. doesn't count flow control packets.
frameslostduetointmacrcverror | rnbc |
sqetesterrors | n/a |
maccontrolframestransmitted | n/a |
maccontrolframesreceived | n/a |
unsupportedopcodesreceived | fcurc |
pausemacctrlframestransmitted | xontxc + xofftxc |
pausemacctrlframesreceived | xonrxc + xoffrxc |

part of the optional package is also implemented, as described in the following table.

table 7-81. ieee 802.3 optional package statistics

optional package capability | intel® 82576 gbe controller counter | notes
multicastframesxmittedok | mptc | intel® 82576 gbe controller doesn't count fc packets
broadcastframesxmittedok | bptc |
multicastframesreceivedok | mprc | intel® 82576 gbe controller doesn't count fc packets
broadcastframesreceivedok | bprc |
table 7-81. ieee 802.3 optional package statistics (continued)

optional package capability | intel® 82576 gbe controller counter | notes
inrangelengtherrors | lenerrs |
outofrangelengthfield | n/a | packets parsed as ethernet ii packets
frametoolongerrors | roc + rjc |

7.12.2 oid_gen_statistics

the 82576 supports part of the oid_gen_statistics as defined by the microsoft* ndis 6.0 spec. the following table describes the matching between the internal statistics and the counters requested by this structure.

table 7-82. microsoft* oid_gen_statistics

oid entry | intel® 82576 gbe controller counters | notes
ifindiscards; | crcerrs + rlec + rxerrc + mpc + rnbc + algnerrc |
ifinerrors; | crcerrs + rlec + rxerrc + algnerrc |
ifhcinoctets; | gorcl/gotcl |
ifhcinucastpkts; | gprc - mprc - bprc |
ifhcinmulticastpkts; | mprc |
ifhcinbroadcastpkts; | bprc |
ifhcoutoctets; | gotcl/gotch |
ifhcoutucastpkts; | gptc - mptc - bptc |
ifhcoutmulticastpkts; | mptc |
ifhcoutbroadcastpkts; | bptc |
ifouterrors; | ecol + latecol |
ifoutdiscards; | ecol |
ifhcinucastoctets; | n/a |
ifhcinmulticastoctets; | n/a |
ifhcinbroadcastoctets; | n/a |
ifhcoutucastoctets; | n/a |
ifhcoutmulticastoctets; | n/a |
ifhcoutbroadcastoctets; | n/a |

7.12.3 rmon

the 82576 supports part of the rmon ethernet statistics group as defined by ietf rfc 2819. the following table describes the matching between the internal statistics and the counters requested by this group.
table 7-83. rmon statistics

rmon statistic | intel® 82576 gbe controller counters | notes
etherstatsdropevents | mpc + rnbc |
etherstatsoctets | totl + toth |
etherstatspkts | tpr |
etherstatsbroadcastpkts | bprc |
etherstatsmulticastpkts | mprc | the 82576 doesn't count fc packets
etherstatscrcalignerrors | crcerrs + algnerrc |
etherstatsundersizepkts | ruc |
etherstatsoversizepkts | roc |
etherstatsfragments | rfc | should count bad aligned fragments as well
etherstatsjabbers | rjc | should count bad aligned jabbers as well
etherstatscollisions | colc |
etherstatspkts64octets | prc64 | rmon counts bad packets as well
etherstatspkts65to127octets | prc127 | rmon counts bad packets as well
etherstatspkts128to255octets | prc255 | rmon counts bad packets as well
etherstatspkts256to511octets | prc511 | rmon counts bad packets as well
etherstatspkts512to1023octets | prc1023 | rmon counts bad packets as well
etherstatspkts1024to1518octets | prc1522 | rmon counts bad packets as well

7.12.4 linux net_device_stats

the 82576 supports part of the net_device_stats as defined by linux kernel version 2.6. the following table describes the matching between the internal statistics and the counters requested by this structure.

table 7-84. linux net_device_stats

net_device_stats field | intel® 82576 gbe controller counters | notes
rx_packets | gprc | the 82576 doesn't count flow control packets; these can be accounted for by using the xonrxc and xoffrxc counters
tx_packets | gptc | the 82576 doesn't count flow control packets; these can be accounted for by using the xontxc and xofftxc counters
rx_bytes | gorcl + gorch |
tx_bytes | gotcl + gotch |
rx_errors | crcerrs + rlec + rxerrc + algnerrc |
tx_errors | ecol + latecol |
table 7-84. linux net_device_stats (continued)

net_device_stats field | intel® 82576 gbe controller counters | notes
rx_dropped | n/a |
tx_dropped | n/a |
multicast | mptc |
collisions | colc |
rx_length_errors | rlec |
rx_over_errors | n/a |
rx_crc_errors | crcerrs |
rx_frame_errors | algnerrc |
rx_fifo_errors | hrmpc |
rx_missed_errors | mpc |
tx_aborted_errors | ecol |
tx_carrier_errors | n/a |
tx_fifo_errors | n/a |
tx_heartbeat_errors | n/a |
tx_window_errors | latecol |
rx_compressed | n/a |
tx_compressed | n/a |

7.12.5 macsec statistics

the 82576 supports the secy management statistics defined in chapter 10 of the ieee 802.1ae d5.1 spec or in the ieee8021-secy-mib described in chapter 13 of the same document.

7.12.6 rx statistics

upon reception of a packet, one and only one of the statistics in table 7-85 is incremented. the precedence order of the statistics is also defined in this table.

table 7-85. rx packet statistics

register name | 802.1ae name | priority | notes
lsecrxut | inpktsuntagged | 1 | used in check mode. packet is forwarded to host. (see footnote 1)
lsecrxut | inpktsnotag | 1 | used in strict mode. packet is dropped (unless it is a kay packet). (see footnote 1)
lsecrxbad | inpktsbadtag | 2 | packet is dropped in strict mode or in check mode when c bit is one. (see footnote 2)
lsecrxunsci | inpktsunknownsci | 3 | used only in check mode. packet is forwarded to host if c bit is zero.
lsecrxnosci | inpktsnosci | 3 | packet is dropped in strict mode or in check mode when c bit is one.
table 7-85. rx packet statistics (continued)

register name | 802.1ae name | priority | notes
lsecrxunsa | inpktsunusedsa | 4 | used only in check mode. packet is forwarded to host if c bit is zero. this statistic reflects the sum of inpktsunusedsa for all sas. per sa inpktsunusedsa statistics are not implemented.
lsecrxnusa | inpktsnotusingsa | 4 | packet is dropped in strict mode or in check mode when c bit is one. this statistic reflects the sum of inpktsnotusingsa for all sas. per sa inpktsnotusingsa statistics are not implemented.
lsecrxlate | inpktslate | 5 | packet is dropped.
n/a | inpktsoverrun | n/a | the 82576 supports wire speed decryption; this field is not used.
lsecrxnv[sa#] | inpktsnotvalid | 6 | packet is dropped in strict mode or in check mode when c bit is one.
lsecrxinv[sa#] | inpktsinvalid | 6 | used only in check mode. packet is forwarded to host if c bit is zero.
lsecrxdelay | inpktsdelayed | 7 | packet is forwarded to host.
lsecrxunch | inpktsunchecked | 8 | packet is forwarded to host.
lsecrxok[sa#] | inpktsok | 9 | packet is forwarded to host.

footnotes:
1. the lsecrxut register does not reflect this statistic exactly, as it counts the kay packets in addition to the untagged packets. to get the exact statistic, the driver should subtract the kay packets received from the lsecrxut value.
2. in the discussion, "e bit" refers to the packet encryption bit, which is set when the packet is encrypted. "c bit" refers to the changed bit, which is set when the data was changed (or encrypted).

upon transmission of a packet, one and only one of the statistics in table 7-86 is incremented.

table 7-86. tx packet statistics

register name | 802.1ae name
lsectxut | outpktsuntagged
lsectxpkte | outpktsencrypted
lsectxpktp | outpktsprotected

table 7-87 describes the correspondence between the 82576 macsec per octet statistics and their counterparts in the 802.1ae spec.

table 7-87. octet statistics

register name | 802.1ae name
lsecrxocte | inoctetsdecrypted
lsecrxoctp | inoctetsvalidated
lsectxocte | outoctetsencrypted
lsectxoctp | outoctetsprotected
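as an illustration of how the counter mappings in section 7.12 can be applied, the following c sketch derives a few of the linux net_device_stats fields of table 7-84 from a snapshot of hardware counters. it is a minimal sketch, not driver code: the struct and function names are hypothetical, only a subset of the fields is shown, and the "gorcl + gorch" and "gotcl + gotch" entries are interpreted here as the low and high halves of one 64-bit counter, which is an assumption about the table's notation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* hypothetical snapshot of the 32-bit statistics registers used below */
struct hw_counters {
    uint32_t gprc, gptc;          /* good packets received / transmitted */
    uint32_t gorcl, gorch;        /* good octets received, low / high */
    uint32_t gotcl, gotch;        /* good octets transmitted, low / high */
    uint32_t crcerrs, rlec, rxerrc, algnerrc;
    uint32_t ecol, latecol, colc, mpc;
};

/* subset of the net_device_stats fields listed in table 7-84 */
struct derived_stats {
    uint64_t rx_packets, tx_packets;
    uint64_t rx_bytes, tx_bytes;
    uint64_t rx_errors, tx_errors;
    uint64_t collisions, rx_missed_errors;
};

void derive_stats(const struct hw_counters *hw, struct derived_stats *out)
{
    assert(hw != NULL && out != NULL);

    out->rx_packets = hw->gprc;
    out->tx_packets = hw->gptc;
    /* assumption: low/high register pairs form one 64-bit octet count */
    out->rx_bytes = ((uint64_t)hw->gorch << 32) | hw->gorcl;
    out->tx_bytes = ((uint64_t)hw->gotch << 32) | hw->gotcl;
    out->rx_errors = (uint64_t)hw->crcerrs + hw->rlec
                   + hw->rxerrc + hw->algnerrc;
    out->tx_errors = (uint64_t)hw->ecol + hw->latecol;
    out->collisions = hw->colc;
    out->rx_missed_errors = hw->mpc;
}
```

flow control packets are not included in gprc and gptc; if they are wanted in the packet totals, the xon/xoff counters named in the table notes would have to be added by the caller.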
7.12.7 statistics hierarchy

the following diagrams describe the relations between the packet flow and the different statistics counters.

figure 7-33. tx flow statistics
figure 7-34. rx flow statistics

figure 7-35.
8.0 programming interface

8.1 introduction

this chapter details the programmer-visible state inside the 82576. in some cases, it describes hardware structures invisible to software in order to clarify a concept. the 82576's address space is mapped into four regions with pci base address registers described in section 9.4.11. these regions are listed in the table below.

table 8-1. address space regions

addressable content | how mapped | size of region
internal registers and memories | direct memory-mapped | 128k
flash (optional) | direct memory-mapped | 64-512k
expansion rom (optional) | direct memory-mapped | 64-512k
internal registers and memories, flash (optional) | i/o window mapped | 32 bytes
msi-x (optional) | direct memory-mapped | 16k

both the flash and expansion rom base address registers map the same flash memory. the internal registers, memories and flash can also be accessed through i/o space indirectly, as explained below. the internal register/memory space is described in the following sections. the phy registers are accessed through the mdio interface.

8.1.1 memory and i/o address decoding

8.1.1.1 memory-mapped access to internal registers and memories

the internal registers and memories might be accessed as direct memory-mapped offsets from the base address register (bar0 or bar 0/1, see section 9.4.11). see section 8.1.3 for the appropriate offset for each specific internal register. in iov mode, this area is partially duplicated per vf. all replications contain only the subset of the register set that is available for vf programming.
8.1.1.2 memory-mapped access to flash

the external flash might be accessed using direct memory-mapped offsets from the flash base address register (bar1 or bar2/3, see section 9.4.11). the flash is only accessible if enabled through the eeprom initialization control word, and if the flash base address register contains a valid (non-zero) base memory address. for accesses, the offset from the flash bar corresponds to the offset into the flash actual physical memory space.

8.1.1.3 memory-mapped access to msi-x tables

the msi-x tables might be accessed as direct memory-mapped offsets from the base address register (bar3 or bar 4/5, see section 9.4.11). in iov mode, this area is duplicated per vf.

8.1.1.4 memory-mapped access to expansion rom

the external flash might also be accessed as a memory-mapped expansion rom. accesses to offsets starting from the expansion rom base address (see section 9.4.11) reference the flash, provided that access is enabled through the eeprom initialization control word and the expansion rom base address register contains a valid (non-zero) base memory address.

8.1.1.5 i/o-mapped access to internal registers, memories, and flash

to support pre-boot operation (prior to the allocation of physical memory base addresses), all internal registers, memories, and flash can be accessed using i/o operations. i/o accesses are supported only if an i/o base address is allocated and mapped (bar2, see section 9.4.11), the bar contains a valid (non-zero) value, and i/o address decoding is enabled in the pcie configuration. when an i/o bar is mapped, the i/o address range allocated opens a 32-byte "window" in the system i/o address map. within this window, two i/o addressable registers are implemented: ioaddr and iodata. the ioaddr register is used to specify a reference to an internal register, memory, or flash, and then the iodata register is used as a "window" to the register, memory or flash address specified by ioaddr:

table 8-2. ioaddr and iodata

offset | abbreviation | name | rw | size
0x00 | ioaddr | internal register, internal memory, or flash location address. 0x00000-0x1ffff - internal registers and memories; 0x20000-0x7ffff - undefined; 0x80000-0xfffff - flash | rw | 4 bytes
0x04 | iodata | data field for reads or writes to the internal register, internal memory, or flash location as identified by the current value in ioaddr. all 32 bits of this register are read/write-able. | rw | 4 bytes
0x08 - 0x1f | reserved | reserved | ro | 4 bytes

8.1.1.5.1 ioaddr (i/o offset 0x00)
the ioaddr register must always be written as a dword access. writes that are less than 32 bits are ignored. reads of any size return a dword of data. however, the chipset or cpu might only return a subset of that dword.

for software programmers, the in and out instructions must be used to cause i/o cycles to be used on the pcie bus. because writes must be to a 32-bit quantity, the source register of the out instruction must be eax (the only 32-bit register supported by the out command). for reads, the in instruction can have any size target register, but it is recommended that the 32-bit eax register be used.

because only a particular range is addressable, the upper bits of this register are hard coded to zero. bits 31 through 20 are not write-able and always read back as 0b. at hardware reset (internal_power_on_reset) or pci reset, this register value resets to 0x00000000. once written, the value is retained until the next write or reset.

8.1.1.5.2 iodata (i/o offset 0x04)

the iodata register must always be written as a dword access when the ioaddr register contains a value for the internal registers and memories (for example, 0x00000-0x1fffc). in this case, writes that are less than 32 bits are ignored.

the iodata register might be written as a byte, word, or dword access when the ioaddr register contains a value for the flash (for example, 0x80000-0xfffff). in this case, the value in ioaddr must be properly aligned to the data value. this table identifies the supported configurations:

table 8-3. intel® 82576 gbe controller ioaddr bits

access type | ioaddr register bits [1:0] | target iodata access be[3:0]# bits in data phase
byte (8 bit) | 00b | 1110b
byte (8 bit) | 01b | 1101b
byte (8 bit) | 10b | 1011b
byte (8 bit) | 11b | 0111b
word (16 bit) | 00b | 1100b
word (16 bit) | 10b | 0011b
dword (32 bit) | 00b | 0000b

note: software might have to implement non-obvious code to access the flash a byte or a word at a time. example code that reads a flash byte is shown here to illustrate the impact of the above table:

    char *ioaddr;
    char *iodata;
    ioaddr = iobase + 0;                                /* ioaddr at i/o offset 0x00 */
    iodata = iobase + 4;                                /* iodata at i/o offset 0x04 */
    *(ioaddr) = flash_byte_address;                     /* select the flash location */
    read_data = *(iodata + (flash_byte_address % 4));   /* read the matching byte lane */

reads to iodata of any size return a dword of data. however, the chipset or cpu might only return a subset of that dword.

for software programmers, the in and out instructions must be used to cause i/o cycles to be used on the pcie bus. where 32-bit quantities are required on writes, the source register of the out instruction must be eax (the only 32-bit register supported by the out command).

writes and reads to iodata when the ioaddr register value is in an undefined range (0x20000-0x7fffc) should not be performed. results cannot be determined.

note: there are no special software timing requirements on accesses to ioaddr or iodata. all accesses are immediate, except when data is not readily available or acceptable. in this case, the 82576 delays the results through normal bus methods (for example, split transaction or transaction retry).

note: because a register/memory/flash read or write takes two i/o cycles to complete, software must guarantee that the two i/o cycles occur as an atomic operation. otherwise, results can be non-deterministic from the software viewpoint.

8.1.1.5.3 undefined i/o offsets

i/o offsets 0x08 through 0x1f are considered to be reserved offsets within the i/o window. dword reads from these addresses return 0xffff; writes to these addresses are discarded.

8.1.2 register conventions

all registers in the 82576 are defined to be 32 bits and should be accessed as 32-bit double words. there are some exceptions to this rule:

• register pairs where two 32-bit registers make up a larger logical size.
• accesses to flash memory (via expansion rom space, secondary bar space, or the i/o space) might be byte, word or double word accesses.

reserved bit positions: some registers contain certain bits that are marked as "reserved".
these bits should never be set to a value of "one" by software. reads from registers containing reserved bits might return indeterminate values in the reserved bit positions unless read values are explicitly stated. when read, these reserved bits should be ignored by software.

reserved and/or undefined addresses: any register address not explicitly declared in this specification should be considered to be reserved, and should not be written to. writing to reserved or undefined register addresses might cause indeterminate behavior. reads from reserved or undefined configuration register addresses might return indeterminate values unless read values are explicitly stated for specific addresses.

initial values: most registers define the initial hardware values prior to being programmed. in some cases, hardware initial values are undefined and are listed as such via the text "undefined", "unknown", or "x". such configuration values might need to be set via eeprom configuration or via software in order for proper operation to occur; this need is dependent on the function of the bit. other registers might cite a hardware default which is overridden by a higher-precedence operation. operations which might supersede hardware defaults include a valid eeprom load, completion of a hardware operation (such as hardware auto-negotiation), or writing of a different register whose value is then reflected in another bit.
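the two-cycle ioaddr/iodata access of section 8.1.1.5 and the byte enables of table 8-3 can be sketched in c as follows. this is a hedged illustration, not a driver: struct io_ops is a hypothetical port-i/o backend (on real hardware its callbacks would issue in/out cycles relative to the i/o bar), fake_dev is a test double that models only the ioaddr register and four dwords of register space, and all function names are assumptions made for this example.

```c
#include <assert.h>
#include <stdint.h>

#define IOADDR_OFF 0x00u   /* ioaddr: i/o offset 0x00 */
#define IODATA_OFF 0x04u   /* iodata: i/o offset 0x04 */

/* hypothetical backend for 32-bit accesses inside the 32-byte i/o window */
struct io_ops {
    void     (*out32)(void *ctx, uint32_t off, uint32_t val);
    uint32_t (*in32)(void *ctx, uint32_t off);
    void      *ctx;
};

/* register read: cycle 1 selects the address, cycle 2 reads the data.
 * the caller must make the pair atomic (for example, by holding a lock) */
uint32_t win_read_reg(const struct io_ops *io, uint32_t reg)
{
    assert(reg <= 0x1fffcu && (reg & 3u) == 0);
    io->out32(io->ctx, IOADDR_OFF, reg);
    return io->in32(io->ctx, IODATA_OFF);
}

/* register write: same two cycles, with a dword write to iodata */
void win_write_reg(const struct io_ops *io, uint32_t reg, uint32_t val)
{
    assert(reg <= 0x1fffcu && (reg & 3u) == 0);
    io->out32(io->ctx, IOADDR_OFF, reg);
    io->out32(io->ctx, IODATA_OFF, val);
}

/* active-low byte enables be[3:0]# for a flash access per table 8-3 */
unsigned iodata_byte_enables(unsigned size_bytes, uint32_t ioaddr)
{
    assert(size_bytes == 1 || size_bytes == 2 || size_bytes == 4);
    assert((ioaddr & (size_bytes - 1u)) == 0);   /* aligned to access size */
    unsigned lanes = (1u << size_bytes) - 1u;    /* 1, 3 or 15 */
    return ~(lanes << (ioaddr & 3u)) & 0xfu;
}

/* test double: models ioaddr (bits 31:20 forced to zero) and 4 dwords */
struct fake_dev { uint32_t ioaddr; uint32_t regs[4]; };

void fake_out32(void *ctx, uint32_t off, uint32_t val)
{
    struct fake_dev *d = ctx;
    if (off == IOADDR_OFF)
        d->ioaddr = val & 0xfffffu;
    else if (off == IODATA_OFF)
        d->regs[(d->ioaddr / 4u) % 4u] = val;
}

uint32_t fake_in32(void *ctx, uint32_t off)
{
    struct fake_dev *d = ctx;
    if (off == IOADDR_OFF)
        return d->ioaddr;
    return d->regs[(d->ioaddr / 4u) % 4u];
}
```

the sketch keeps the address and data cycles in one function so that a caller can wrap each call in whatever mutual exclusion the environment provides, which is how the atomicity requirement noted in section 8.1.1.5.2 would typically be met.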
programming interface ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 445 for registers that should be accessed as 32 bit double words, partial writes (less than a 32 bit double word) does not take effect (the write is ignored). partial reads returns all 32 bits of data regardless of the byte enables. notes: partial reads to read-on-clear registers (icr) can have unexpected results since all 32 bits are actually read regardless of the byte enables. partial reads should not be done. all statistics registers are implemented as 32 bit registers. though some logical statistics registers represent counters in excess of 32-bits in width, registers must be accessed using 32-bit operations (for example, independent access to each 32-bit field). when reading 64 bits statistics registers the least significant 32 bit register should be read first. see special notes for vlan filter table, multicast table arrays and packet buffer memory which appear in the specific register definitions. the 82576 register fields are assigned one of the attributes described in table 8-4 . phy registers described in section 8.2.5 use a special nomenclature to define the read/write mode of individual bits in each register. see table bellow for details. table 8-4. intel? 82576 gbe controller register field attributes attribute description rw read-write field: register bits are read-write and can be either set or cleared by software to the desired state. rws read-write status field: register bits are read-write and can be either set or cleared by software to the desired state. however, the value of this field might be changed by the hardware to reflect a status change. ro read-only register: register bits are read-only and cannot be altered by software. register bits might be initialized by hardware mechanisms such as pin strapping, serial eeprom or reflect a status of the hardware state. 
r/w1c read-only status, write-1-to-clear status register: register bits indicate status when read. a set bit indicating a status event can be cleared by writing a 1b. writing a 0b to an r/w1c bit has no effect.
rsv reserved. do not write to these fields.
rc read-only status, read-to-clear status register: register bits indicate status when read. a set bit indicating a status event is cleared by reading it.
sc self-clear field: a command field that is self-clearing. these fields are always read as zero.
wo write-only field: a command field that cannot be read. the read values of these fields are undefined.
rc/w1c read-to-clear, write-1-to-clear status register: register bits indicate status when read. a set bit indicating a status event can be cleared by writing a 1b or by reading the register. writing a 0b to an rc/w1c bit has no effect.
rs read set: this is the attribute used for semaphore bits. these bits are set by a read if the previous value was zero. in this case the read value is zero; otherwise the read value is one. cleared by writing zero.

table 8-5. phy register nomenclature
register mode description
lh latched high. event is latched and erased when read.
ll latched low. event is latched and erased when read. for example, link loss is latched when the phy control register bit 2 = 0b. after read, if the link is good, the phy control register bit 2 is set to 1b.
ro read only.
r/w read and write.
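The software-visible behavior of several table 8-4 attributes can be modelled with a single-bit sketch. The class below is a hypothetical illustration, not part of any driver or of the datasheet. It covers rw, rws, ro, r/w1c, rc, rc/w1c, sc and rs; the wo attribute is left out because its read value is undefined. Ignoring a write of one to an rs bit is an assumption, since the table only defines the write of zero.

```python
class BitField:
    """Single-bit model of a register field attribute (illustrative only)."""

    def __init__(self, attr: str, value: int = 0):
        self.attr = attr
        self.value = value & 1

    def hw_set(self) -> None:
        """Simulate hardware latching a status event."""
        self.value = 1

    def read(self) -> int:
        if self.attr == "sc":
            return 0                  # self-clearing fields always read as zero
        old = self.value
        if self.attr in ("rc", "rc/w1c"):
            self.value = 0            # reading clears the event
        elif self.attr == "rs":
            self.value = 1            # a read that returns zero takes the semaphore
        return old

    def write(self, bit: int) -> None:
        bit &= 1
        if self.attr in ("rw", "rws"):
            self.value = bit
        elif self.attr in ("r/w1c", "rc/w1c"):
            if bit:
                self.value = 0        # writing 1b clears; writing 0b has no effect
        elif self.attr == "rs":
            if bit == 0:
                self.value = 0        # semaphore is released by writing zero
        # "ro", "rc" and "sc" ignore software writes in this model


status = BitField("r/w1c")
status.hw_set()
status.write(0)                       # no effect on an r/w1c bit
still_set = status.read()
status.write(1)                       # clears the event
after_clear = status.read()
```

The rs case is the one most easily misread: the first reader sees zero and thereby owns the semaphore, and every later reader sees one until the owner writes zero.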
note: for all binary equations appearing in the register map, the symbol "|" is equivalent to a binary or operation.

8.1.2.1 registers byte ordering

this section defines the structure of registers that contain fields carried over the network. some examples are l2, l3, l4 fields, macsec fields, and ipsec fields.

the following example is used to describe byte ordering over the wire (hex notation):
last                         first
..., 06, 05, 04, 03, 02, 01, 00

each byte is sent with the lsbit first. that is, the bit order over the wire for this example is:
last                                          first
..., 0000 0011, 0000 0010, 0000 0001, 0000 0000

the general rule for register ordering is to use host ordering (also called little endian). using the above example, a 6-byte field (mac address) is stored in a csr in the following manner:
                   byte 3  byte 2  byte 1  byte 0
dw address (n)     0x03    0x02    0x01    0x00
dw address (n+4)   ...     ...     0x05    0x04

the exceptions listed below use network ordering (also called big endian). using the above example, a 16-bit field (ethertype) is stored in a csr in the following manner:
                   byte 3  byte 2  byte 1  byte 0
(dw aligned)       ...     ...     0x00    0x01
or (word aligned)  0x00    0x01    ...     ...

the following exceptions use network ordering:
all ethertype fields. for example, the vet ext field in the vet register, the etype field in the etqf register, the cmt_eth in the rttbcncr register, the etype field in the metf register.

table 8-5. phy register nomenclature (continued)
register mode description
sc self-clear. the bit is set, automatically executed, and then reset to normal operation.
cr clear after read. for example, 1000base-t status register bits 7:0 (idle error counter).
update value written to the register bit does not take effect until software phy reset is executed.
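The two byte-ordering rules of section 8.1.2.1 can be checked with a short sketch. The function names are hypothetical; the byte values are the ones used in the example above, where the first byte on the wire is 0x00.

```python
from typing import Tuple


def mac_to_dwords(mac: bytes) -> Tuple[int, int]:
    """Pack a 6-byte MAC address (wire order) into two dwords, host ordering.

    The first byte on the wire lands in byte 0 of dw address (n).
    """
    if len(mac) != 6:
        raise ValueError("a MAC address has exactly 6 bytes")
    low = int.from_bytes(mac[0:4], "little")    # dw address (n)
    high = int.from_bytes(mac[4:6], "little")   # low 16 bits of dw address (n+4)
    return low, high


def ethertype_to_field(wire: bytes) -> int:
    """Pack a 2-byte EtherType (wire order) into a 16-bit field, network ordering.

    The first byte on the wire lands in the more significant byte.
    """
    if len(wire) != 2:
        raise ValueError("an EtherType has exactly 2 bytes")
    return int.from_bytes(wire, "big")


low, high = mac_to_dwords(bytes([0x00, 0x01, 0x02, 0x03, 0x04, 0x05]))
etype = ethertype_to_field(bytes([0x00, 0x01]))
```

With the example bytes, `low` is 0x03020100 and `high` is 0x0504, matching the host-ordering layout, while `etype` is 0x0001, which places 0x00 in byte 1 and 0x01 in byte 0 as in the dw-aligned network-ordering layout.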
note: the "normal" notation, as it appears in textbooks, etc., is network ordering. example: suppose a mac address of 00-a0-c9-00-00-00. the order on the network is 00, then a0, then c9, etc. however, the host ordering presentation would be:
                   byte 3  byte 2  byte 1  byte 0
dw address (n)     00      c9      a0      00
dw address (n+4)   ...     ...     00      00

8.1.3 register summary

all the 82576's non-pcie configuration registers, except for the msi-x register, are listed in the table below. these registers are ordered by grouping and are not necessarily listed in the order that they appear in the address space. in an iov system, this list refers to the pf registers; the vf register space is listed in section 8.26.

table 8-6. register summary
offset alias offset abbreviation name rw
general
0x0000 0x00004 ctrl device control register rw
0x0008 n/a status device status register ro
0x0018 n/a ctrl_ext extended device control register rw
0x0020 n/a mdic mdi control register rw
0x0024 n/a serdesctl serdes_ana rw
0x0028 n/a fcal flow control address low ro
0x002c n/a fcah flow control address high ro
0x0030 n/a fct flow control type rw
0x0034 n/a connsw copper/fiber switch control rw
0x0038 n/a vet vlan ether type rw
0x0170 n/a fcttv flow control transmit timer value rw
0x0e00 n/a ledctl led control register rw
0x1028 n/a i2ccmd sfp i2c command rw
0x102c n/a i2cparams sfp i2c parameter rw
0x1040 n/a wdstp watchdog setup register rw
0x1044 n/a wdswsts watchdog software rw
0x1048 n/a frtimer free running timer rws
0x104c n/a tcptimer tcp timer rw
0x5b70 n/a dca_id dca requester id information register ro
0x05b50 n/a swsm software semaphore register rw
0x05b54 n/a fwsm firmware semaphore register rws
0x5b5c n/a sw_fw_sync software-firmware synchronization rws
flash/eeprom registers
0x0010 n/a eec eeprom/flash control register rw
0x0014 n/a eerd eeprom read register rw
0x001c n/a fla flash access register rw
0x1010 n/a eemngctl mng eeprom control register ro
0x1014 n/a eemngdata mng eeprom read/write data ro
0x1018 n/a flmngctl mng flash control register ro
0x101c n/a flmngdata mng flash read data ro
0x1020 n/a flmngcnt mng flash read counter ro
0x1024 n/a eearbc eeprom auto read bus control rw
0x103c n/a flashop flash opcode register rw
0x1038 n/a eediag eeprom diagnostic ro
0x1060 n/a vpddiag vpd diagnostic ro
interrupts
0x01500 0x000c0 icr interrupt cause read rc/w1c
0x01504 0x000c8 ics interrupt cause set wo
0x01508 0x000d0 ims interrupt mask set/read rw
0x0150c 0x000d8 imc interrupt mask clear wo
0x01510 0x000e0 iam interrupt acknowledge auto mask rw
0x1520 n/a eics extended interrupt cause set wo
0x1524 n/a eims extended interrupt mask set/read rws
0x1528 n/a eimc extended interrupt mask clear wo
0x152c n/a eiac extended interrupt auto clear rw
0x1530 n/a eiam extended interrupt auto mask rw
0x1580 n/a eicr extended interrupt cause read rc/w1c
0x1700 - 0x171c n/a ivar interrupt vector allocation registers rw
0x1740 n/a ivar_misc interrupt vector allocation registers - misc rw
0x1680 - 0x16f0 n/a eitr extended interrupt throttling rate 0 - 24 rw
0x1514 n/a gpie general purpose interrupt enable rw
0x5b68 n/a pbacl msi-x pba clear r/w1c
receive
0x00100 n/a rctl rx control rw
0x02160 0x00168 fcrtl0 flow control receive threshold low rw
0x02170 n/a fcrtl1 flow control receive threshold low rw
0x02458 n/a pbdiag pb diagnostic rw
0x02404 n/a rxpbs rx packet buffer size rw
0x24e8 n/a pbrwac rx packet buffer wrap around counter ro
0x02460 n/a fcrtv flow control refresh timer value rw
0x2540 n/a drxmxod dma rx max total allow size requests rw
0xc000 0x00110, 0x02800 rdbal[0] rx descriptor base low queue 0 rw
0xc004 0x00114, 0x02804 rdbah[0] rx descriptor base high queue 0 rw
0xc008 0x00118, 0x02808 rdlen[0] rx descriptor ring length queue 0 rw
0xc00c 0x280c srrctl[0] split and replication receive control register queue 0 rw
0xc010 0x00120, 0x02810 rdh[0] rx descriptor head queue 0 ro
0xc018 0x00128, 0x2818 rdt[0] rx descriptor tail queue 0 rw
0xc028 0x02828 rxdctl[0] receive descriptor control queue 0 rw
0xc014 0x2814 rxctl[0] rx queue 0 dca ctrl register rw
0xc030 0x2830 rqdpc[0] rx queue drop packet count register 0 rc
0xc040 + 0x40 * (n-1) 0x2900 + 0x100 * (n-1) rdbal[1 - 3] rx descriptor base low queue 1 - 3 rw
0xc044 + 0x40 * (n-1) 0x2904 + 0x100 * (n-1) rdbah[1 - 3] rx descriptor base high queue 1 - 3 rw
0xc048 + 0x40 * (n-1) 0x2908 + 0x100 * (n-1) rdlen[1 - 3] rx descriptor ring length queue 1 - 3 rw
0xc04c + 0x40 * (n-1) 0x290c + 0x100 * (n-1) srrctl[1 - 3] split and replication receive control register queue 1 - 3 rw
0xc050 + 0x40 * (n-1) 0x2910 + 0x100 * (n-1) rdh[1 - 3] rx descriptor head queue 1 - 3 ro
0xc058 + 0x40 * (n-1) 0x2918 + 0x100 * (n-1) rdt[1 - 3] rx descriptor tail queue 1 - 3 rw
0xc068 + 0x40 * (n-1) 0x2928 + 0x100 * (n-1) rxdctl[1 - 3] receive descriptor control queue 1 - 3 rw
0xc054 + 0x40 * (n-1) 0x2914 + 0x100 * (n-1) rxctl[1 - 3] rx queue 1 - 3 dca ctrl register rw
0xc070 + 0x40 * (n-1) 0x2930 + 0x100 * (n-1) rqdpc[1 - 3] rx queue drop packet count register 1 - 3 rc
0xc100 + 0x40 * (n-4) n/a rdbal[4-15] rx descriptor base low queue 4 - 15 rw
0xc104 + 0x40 * (n-4) n/a rdbah[4-15] rx descriptor base high queue 4 - 15 rw
0xc108 + 0x40 * (n-4) n/a rdlen[4-15] rx descriptor ring length queue 4 - 15 rw
0xc10c + 0x40 * (n-4) n/a srrctl[4 - 15] split and replication receive control register queue 4 - 15 rw
0xc110 + 0x40 * (n-4) n/a rdh[4 - 15] rx descriptor head queue 4 - 15 ro
0xc118 + 0x40 * (n-4) n/a rdt[4 - 15] rx descriptor tail queue 4 - 15 rw
0xc128 + 0x40 * (n-4) n/a rxdctl[4 - 15] receive descriptor control queue 4 - 15 rw
0xc114 + 0x40 * (n-4) n/a rxctl[4 - 15] rx queue 4 - 15 dca ctrl register rw
0xc130 + 0x40 * (n-4) n/a rqdpc[4 - 15] rx queue drop packet count register 4 - 15 rc
0x05000 n/a rxcsum receive checksum control rw
0x05004 n/a rlpml receive long packet maximal length rw
0x05008 n/a rfctl receive filter control register rw
0x05200 - 0x053fc 0x00200 - 0x003fc mta[127:0] multicast table array (n) rw
0x05400 + 8*n 0x00040 + 8*n ral[0-15] receive address low (15:0) rw
0x05404 + 8*n 0x00044 + 8*n rah[0-15] receive address high (15:0) rw
0x054e0 + 8*n n/a ral[16-23] receive address low (23:16) rw
0x054e4 + 8*n n/a rah[16-23] receive address high (23:16) rw
0x5480 - 0x549c n/a psrtype[7:0] packet split receive type (n) rw
0x54c0 n/a rplpsrtype replicated packet split receive type rw
0x581c n/a vt_ctl next generation vmdq control register rw
0x05600 - 0x057fc 0x00600 - 0x007fc vfta[127:0] vlan filter table array (n) rw
0x05818 n/a mrqc multiple receive queues command rw
0x05c00 - 0x05c7c n/a reta redirection table rw
0x05c80 - 0x05ca4 n/a rssrk rss random key register rw
0x34e8 n/a pbtwac tx packet buffer wrap around counter ro
transmit
0x03404 n/a txpbs tx packet buffer size rw
0x00400 n/a tctl tx control rw
0x00404 n/a tctl_ext tx control extended rw
0x00410 n/a tipg tx ipg rw
0x3590 n/a dtxctl dma tx control rw
0x359c n/a dtxtcpflgl dma tx tcp flags control low rw
0x35a0 n/a dtxtcpflgh dma tx tcp flags control high rw
0x3540 n/a dtxmxszrq dma tx max total allow size requests rw
0xe000 0x00420, 0x03800 tdbal[0] tx descriptor base low 0 rw
0xe004 0x00424, 0x03804 tdbah[0] tx descriptor base high 0 rw
0xe008 0x00428, 0x03808 tdlen[0] tx descriptor ring length 0 rw
0xe010 0x00430, 0x03810 tdh[0] tx descriptor head 0 ro
0xe018 0x00438, 0x03818 tdt[0] tx descriptor tail 0 rw
0xe028 0x03828 txdctl[0] transmit descriptor control queue 0 rw
0xe014 0x3814 txctl[0] tx dca ctrl register queue 0 rw
0xe038 0x3838 tdwbal[0] transmit descriptor wb address low queue 0 rw
0xe03c 0x383c tdwbah[0] transmit descriptor wb address high queue 0 rw
0x0e040 + 0x40 * (n-1) 0x03900 + 0x100 * (n-1) tdbal[1-3] tx descriptor base low queue 1 - 3 rw
0x0e044 + 0x40 * (n-1) 0x03904 + 0x100 * (n-1) tdbah[1-3] tx descriptor base high queue 1 - 3 rw
0x0e048 + 0x40 * (n-1) 0x03908 + 0x100 * (n-1) tdlen[1-3] tx descriptor ring length queue 1 - 3 rw
0x0e050 + 0x40 * (n-1) 0x03910 + 0x100 * (n-1) tdh[1-3] tx descriptor head queue 1 - 3 ro
0x0e058 + 0x40 * (n-1) 0x03918 + 0x100 * (n-1) tdt[1-3] tx descriptor tail queue 1 - 3 rw
0x0e068 + 0x40 * (n-1) 0x03928 + 0x100 * (n-1) txdctl[1-3] transmit descriptor control 1 - 3 rw
0x0e054 + 0x40 * (n-1) 0x3914 + 0x100 * (n-1) txctl[1-3] tx dca ctrl register queue 1 - 3 rw
0x0e078 + 0x40 * (n-1) 0x3938 + 0x100 * (n-1) tdwbal[1-3] transmit descriptor wb address low queue 1 - 3 rw
0x0e07c + 0x40 * (n-1) 0x393c + 0x100 * (n-1) tdwbah[1-3] transmit descriptor wb address high queue 1 - 3 rw
0x0e180 + 0x40 * n n/a tdbal[4 - 15] tx descriptor base low queue 4 - 15 rw
0x0e184 + 0x40 * n n/a tdbah[4 - 15] tx descriptor base high queue 4 - 15 rw
0x0e188 + 0x40 * n n/a tdlen[4 - 15] tx descriptor ring length queue 4 - 15 rw
0x0e190 + 0x40 * n n/a tdh[4 - 15] tx descriptor head queue 4 - 15 ro
0x0e198 + 0x40 * n n/a tdt[4 - 15] tx descriptor tail queue 4 - 15 rw
0x0e1a8 + 0x40 * n n/a txdctl[4 - 15] transmit descriptor control 4 - 15 rw
0x0e194 + 0x40 * n n/a txctl[4 - 15] tx queue 4 - 15 dca ctrl register rw
0x0e1b8 + 0x40 * n n/a tdwbal[4 - 15] transmit descriptor wb address low queue 4 - 15 rw
0x0e1bc + 0x40 * n n/a tdwbah[4 - 15] transmit descriptor wb address high queue 4 - 15 rw
filters
0x5cb0 - 0x5ccc n/a etqf[0 - 7] etype queue filter 0 - 7 rw
0x05a80 - 0x05a9c n/a imir[0 - 7] immediate interrupt rx [7:0] rw
0x5aa0 - 0x5abc n/a imirext[0 - 7] immediate interrupt rx extended[0-7] rw
0x05ac0 n/a imirvp immediate interrupt rx vlan priority rw
0x5980 - 0x599c n/a saqf[0 - 7] source address queue filter 0 - 7 rw
0x59a0 - 0x59bc n/a daqf[0 - 7] destination address queue filter 0 - 7 rw
0x59c0 - 0x59dc n/a spqf[0 - 7] source port queue filter 0 - 7 rw
0x59e0 - 0x59fc n/a ftqf[0 - 7] five-tuple queue filter 0 - 7 rw
0x55fc n/a synqf syn packet queue filter rw
virtualization
0x03004 n/a swpbs switch packet buffer size rw
0x030e8 n/a pbswac switch packet buffer wrap around counter ro
0x00c40 - 0x00c5c n/a vfmailbox[0 - 7] vf mailbox register rw
0x00c00 - 0x00c1c n/a pfmailbox[0 - 7] pf mailbox register rw
0x00800 - 0x009fc n/a vmbmem virtual machines mailbox memory rw
0x0c80 n/a mbvficr mailbox vf interrupt causes r/w1c
0x0c84 n/a mbvfimr mailbox vf interrupt mask rw
0x0c88 n/a vflre vflr events r/w1c
0x0c8c n/a vfre vf receive enable rw
0x0c90 n/a vfte vf transmit enable rw
0x3554 n/a wvbr wrong vm behavior register rc
0x3510 n/a vmecm vm error count mask rw
0x3548 n/a lvmmc last vm misbehavior cause rc
0x3558 n/a mdfb malicious driver free block w1c
0x3550 n/a dtxmapt dma max arbitration time rw
0x02408 n/a qde queue drop enable register rw
0x3500 n/a dtxswc dma tx switch control rw
0x05d00 - 0x05d7c n/a vlvf vlan vm filter rw
0x05ad0 - 0x5aec n/a vmolr[0 - 7] vm offload register[0-7] rw
0x03700 n/a vmvir vm vlan insert register rw
0x05af0 n/a rplolr replication offload register rw
0x0a000 - 0x0a1fc n/a uta unicast table array wo
0x5d80 - 0x5d8c n/a vmrctl virtual mirror rule control rw
0x5d90 - 0x5d9c n/a vmrvlan virtual mirror rule vlan rw
0x5da0 - 0x5dac n/a vmrvm virtual mirror rule vm rw
0x5db0 n/a sccrl storm control control register rw
0x5db4 n/a scsts storm control status ro
0x5db8 n/a bsctrh broadcast storm control threshold rw
0x5dbc n/a msctrh multicast storm control threshold rw
0x5dc0 n/a bsccnt broadcast storm control current count ro
0x5dc4 n/a msccnt multicast storm control current count ro
0x5dc8 n/a sctc storm control time counter ro
vf registers mirrors
0x10000 + vfn * 0x100 vtctrl vf control (only rst bit) rw
0x10020 + vfn * 0x100 vteics extended interrupt cause set register wo
0x10024 + vfn * 0x100 vteims extended interrupt mask set/read register rw
0x10028 + vfn * 0x100 vteimc extended interrupt mask clear register wo
0x1002c + vfn * 0x100 vteiac extended interrupt auto clear register rw
0x10030 + vfn * 0x100 vteiam extended interrupt auto mask enable register rw
0x10080 + vfn * 0x100 vteicr extended interrupt cause set register rc/w1c
0x10010 + vfn * 0x100 vfgprc good packets received count ro
0x10014 + vfn * 0x100 vfgptc good packets transmitted count ro
0x10018 + vfn * 0x100 vfgorc good octets received count ro
0x10034 + vfn * 0x100 vfgotc good octets transmitted count ro
0x1003c + vfn * 0x100 vfmprc multicast packets received count ro
0x10040 + vfn * 0x100 vfgprlbc good rx packets loopback count ro
0x10044 + vfn * 0x100 vfgptlbc good tx packets loopback count ro
0x10048 + vfn * 0x100 vfgorlbc good rx octets loopback count ro
0x10050 + vfn * 0x100 vfgotlbc good tx octets loopback count ro
0x3600 n/a trldcs transmit rate-limiter descriptor plane control & status rw
0x3690 - 0x3694 n/a transmit rate-er mmw rw
0x3604 n/a dqsel transmit descriptor plane queue select rw
0x36b0 n/a rc transmit rate-er config rw
0x36b4 n/a rs transmit rate-er status rw
statistics
0x04000 n/a crcerrs crc error count rc
0x04004 n/a algnerrc alignment error count rc
0x04008 n/a symerrs symbol error count rc
0x0400c n/a rxerrc rx error count rc
0x04010 n/a mpc missed packets count rc
0x04014 n/a scc single collision count rc
0x04018 n/a ecol excessive collisions count rc
0x0401c n/a mcc multiple collision count rc
0x04020 n/a latecol late collisions count rc
0x04028 n/a colc collision count rc
0x0402c n/a cbtmpc circuit breaker tx manageability packet count rc
0x04030 n/a dc defer count rc
0x04034 n/a tncrs transmit - no crs rc
0x0403c n/a htdpmc host transmit discarded packets by mac count rc
0x04040 n/a rlec receive length error count rc
0x04044 n/a cbrdpc circuit breaker rx dropped packet rc
0x04048 n/a xonrxc xon received count rc
0x0404c n/a xontxc xon transmitted count rc
0x04050 n/a xoffrxc xoff received count rc
0x04054 n/a xofftxc xoff transmitted count rc
0x04058 n/a fcruc fc received unsupported count rc
0x0405c n/a prc64 packets received (64 bytes) count rc
0x04060 n/a prc127 packets received (65-127 bytes) count rc
0x04064 n/a prc255 packets received (128-255 bytes) count rc
0x04068 n/a prc511 packets received (256-511 bytes) count rc
0x0406c n/a prc1023 packets received (512-1023 bytes) count rc
0x04070 n/a prc1522 packets received (1024-1522 bytes) rc
0x04074 n/a gprc good packets received count rc
0x04078 n/a bprc broadcast packets received count rc
0x0407c n/a mprc multicast packets received count rc
0x04080 n/a gptc good packets transmitted count rc
0x04088 n/a gorcl good octets received count (lo) rc
0x0408c n/a gorch good octets received count (hi) rc
0x04090 n/a gotcl good octets transmitted count (lo) rc
0x04094 n/a gotch good octets transmitted count (hi) rc
0x040a0 n/a rnbc receive no buffers count rc
0x024e0 0x024e4 0x040a0 tcrnbc[1:0] tc receive no buffers count [1:0] rc
0x040a4 n/a ruc receive under size count rc
0x040a8 n/a rfc receive fragment count rc
0x040ac n/a roc receive oversize count rc
0x040b0 n/a rjc receive jabber count rc
0x040b4 n/a mngprc management packets receive count rc
0x040b8 n/a mpdc management packets dropped count rc
0x040bc n/a mngptc management packets transmitted count rc
0x413c n/a bmngprc bmc management packets receive count rc
0x4140 n/a bmpdc bmc management packets dropped count rc
0x4144 n/a bmngptc bmc management packets transmitted count rc
0x040c0 n/a torl total octets received (lo) rc
0x040c4 n/a torh total octets received (hi) rc
0x040c8 n/a totl total octets transmitted (lo) rc
0x040cc n/a toth total octets transmitted (hi) rc
0x040d0 n/a tpr total packets received rc
0x040d4 n/a tpt total packets transmitted rc
0x040d8 n/a ptc64 packets transmitted (64 bytes) count rc
0x040dc n/a ptc127 packets transmitted (65-127 bytes) count rc
0x040e0 n/a ptc255 packets transmitted (128-255 bytes) count rc
0x040e4 n/a ptc511 packets transmitted (256-511 bytes) count rc
0x040e8 n/a ptc1023 packets transmitted (512-1023 bytes) count rc
0x040ec n/a ptc1522 packets transmitted (1024-1522 bytes) count rc
0x040f0 n/a mptc multicast packets transmitted count rc
0x040f4 n/a bptc broadcast packets transmitted count rc
0x040f8 n/a tsctc tcp segmentation context transmitted count rc
0x040fc n/a cbrmpc circuit breaker rx manageability packet count rc
0x04100 n/a iac interrupt assertion count rc
0x04104 n/a rpthc rx packets to host count rc
0x04108 n/a dbgc1 debug counter 1 rc
0x0410c n/a dbgc2 debug counter 2 rc
0x04110 n/a dbgc3 debug counter 3 rc
0x0411c n/a dbgc4 debug counter 4 rc
0x04118 n/a hgptc host good packets transmitted count rc
0x04120 n/a rxdmtc rx descriptor minimum threshold count rc
0x04124 n/a htcbdpc host tx circuit breaker dropped packets count rc
0x04128 n/a hgorcl host good octets received count (lo) rc
0x0412c n/a hgorch host good octets received count (hi) rc
0x04130 n/a hgotcl host good octets transmitted count (lo) rc
0x04134 n/a hgotch host good octets transmitted count (hi) rc
0x4138 n/a lenerrs length errors count register rc
0x4228 n/a scvpc serdes/sgmii code violation packet count register rw
0x41a0 n/a ssvpc switch security violation packet count rc
0x41a4 n/a sdpc switch drop packet count rc
0x4300 n/a lsectxut macsec tx untagged packet counter rc
0x4304 n/a lsectxpkte macsec encrypted tx packets count rc
0x4308 n/a lsectxpktp macsec protected tx packets count rc
0x430c n/a lsectxocte macsec encrypted tx octets count rc
0x4310 n/a lsectxoctp macsec protected tx octets count rc
0x4314 n/a lsecrxut macsec untagged rx packet count rc
0x431c n/a lsecrxocte macsec rx octets decrypted count rc
0x4320 n/a lsecrxoctp macsec rx octets validated rc
0x4324 n/a lsecrxbad macsec rx packet with bad tag rc
0x4328 n/a lsecrxnosci macsec rx packet no sci count rc
0x432c n/a lsecrxunsci macsec rx packet unknown sci count rc
0x4330 n/a lsecrxunch macsec rx unchecked packets count rc
0x4340 n/a lsecrxdelay macsec rx delayed packets count rc
0x4350 n/a lsecrxlate macsec rx late packets count rc
0x4360 - 0x4364 n/a lsecrxok[n] macsec rx packet ok count rc
0x4380 - 0x4384 n/a lsecrxinv[n] macsec rx invalid count rc
0x43a0 - 0x43a4 n/a lsecrxnv[n] macsec rx not valid count rc
0x43c0 n/a lsecrxnusa macsec rx not using sa count rc
0x43d0 n/a lsecrxunsa macsec rx unused sa count rc
wake up
0x05800 n/a wuc wake up control rw
0x05808 n/a wufc wake up filter control rw
0x05810 n/a wus wake up status r/w1c
0x05900 n/a wupl wake up packet length ro
0x05a00 - 0x05a7c n/a wupm wake up packet memory ro
0x09000 - 0x093fc n/a fhft flexible host filter table registers rw
0x09a00 - 0x09bfc n/a fhft_ext flexible host filter table registers extended rw
manageability
0x05010 - 0x0502c n/a mavtv[7:0] vlan tag value 7 - 0 rw
0x5030 - 0x504c n/a mfutp[7:0] management flex udp/tcp ports rw
0x05060 - 0x0506c n/a metf[3:0] management ethernet type filters rw
0x05820 n/a manc management control rw
0x05838 n/a ipav ip address valid rw
0x5824 n/a mfval manageability filters valid rw
0x05840 - 0x05858 n/a ip4at ipv4 address table rw
0x05860 n/a manc2h management control to host register rw
0x05880 - 0x0588f n/a ip6at ipv6 address table rw
0x5890 - 0x58ac n/a mdef[7:0] manageability decision filters rw
0x5930 - 0x594c n/a mdef_ext[7:0] manageability decision filters rw
0x58b0 - 0x58ec n/a mipaf manageability ip address filter rw
0x5910 + 8*n n/a mmal[3:0] manageability mac address low 3:0 rw
0x5914 + 8*n n/a mmah[3:0] manageability mac address high 3:0 rw
0x09400 - 0x097fc n/a ftft flexible tco filter table rw
0x08800 - 0x08efc n/a flex mng flex manageability memory address space rw
0x08f14 n/a lswfw macsec software/firmware interface ro
0x08f14 n/a reserved reserved
pcie
0x05b00 n/a gcr pcie control register rw
0x05b04 n/a rtiv replay timer initial value register rw
0x05b08 n/a functag function tag register rw
0x05b88 n/a ciaa config indirect access address rw
0x05b8c n/a ciad config indirect access data rw
0x5bbc n/a iovctl iov control rw
0x05b0c n/a ltiv latency timer initial value register rw
0x05b10 n/a gscl_1 pcie statistics control #1 rw
0x05b14 n/a gscl_2 pcie statistics control #2 rw
0x05b18 n/a gscl_3 pcie statistics control #3 rw
0x05b1c n/a gscl_4 pcie statistics control #4 rw
0x5b90 - 0x5b9c n/a gscl_lbt[3:0] pcie statistics control leaky bucket timer rw
0x05b20 n/a gscn_0 pcie counter register #0 rw
0x05b24 n/a gscn_1 pcie counter register #1 rw
0x05b28 n/a gscn_2 pcie counter register #2 rw
0x05b2c n/a gscn_3 pcie counter register #3 rw
0x05b30 n/a factps function active and power state rw
0x05b34 n/a gioanactl0 serdes/ccm/pcie csr rw
0x05b38 n/a gioanactl1 serdes/ccm/pcie csr rw
0x05b3c n/a gioanactl2 serdes/ccm/pcie csr rw
0x05b40 n/a gioanactl3 serdes/ccm/pcie csr rw
0x05b44 n/a gioanactlall serdes/ccm/pcie csr rw
0x05b48 n/a ccmctl serdes/ccm/pcie csr rw
0x05b4c n/a scctl serdes/ccm/pcie csr rw
0x05b64 n/a mrevid mirrored revision id ro
0x05b74 n/a dca_ctrl dca control register rw
diagnostic
0x02410 0x08000 rdfh rx data fifo head rws
0x02418 0x08008 rdft rx data fifo tail rws
0x02420 n/a rdfhs rx data fifo head saved rws
0x02428 n/a rdfts rx data fifo tail saved rws
0x02430 n/a rdfpc receive data fifo packet count ro
0x3010 n/a swbfh switch buffer fifo head rws
0x3018 n/a swbft switch buffer fifo tail rws
0x3020 n/a swbfhs switch buffer fifo head saved rws
0x3028 n/a swbfts switch buffer fifo tail saved rws
0x03030 n/a swdfpc switch data fifo packet count ro
0x245c n/a rpbeccsts receive packet buffer ecc control rc
0x345c n/a tpbeccsts transmit packet buffer ecc control rc
0x305c n/a swpbeccsts switch packet buffer ecc control rc
0xb470 n/a ippbeccsts ipsec packet buffer ecc control rc
0x2464 n/a fcsts0 flow control status ro
0x25c0 n/a rdhests rx descriptor handler ecc status rc
0x35c0 n/a tdhests tx descriptor handler ecc status rc
0x05bb0 n/a pwbests pcie write buffer ecc status rc
0x05ba8 n/a pmsixests pcie msi-x ecc status rc
0x05ba0 n/a prbests pcie retry buffer ecc status rc
0x0b474 n/a ippbeei ipsec packet buffer ecc error inject rw
0x25fc n/a rdhmp rx descriptor handler memory page number rw
0x03410 0x08010 tdfh tx data fifo head rws
0x03418 0x08018 tdft tx data fifo tail rws
0x03420 n/a tdfhs tx data fifo head saved rws
0x03428 n/a tdfts tx data fifo tail saved rws
0x03430 n/a tdfpc transmit data fifo packet count ro
0x35fc n/a tdhmp tx descriptor handler memory page number rw
0x6000 - 0x6ffc n/a rdhm rx descriptors internal cache ro
0x7000 - 0x7ffc n/a tdhm tx descriptors internal cache ro
0x3100 n/a pbslac pb slave access control rw
0x3110 - 0x311c n/a pbslad pb slave access data rw
0x1084 n/a peind parity and ecc indication rc
0x1088 n/a peindm parity and ecc indication mask rw
0x5bb8 n/a functo function timeout ro
0xa000 n/a cbcr circuit breaker configuration rw
0xa010 n/a cbcs circuit breaker counter status r/w1c
0xa040 - 0xa0bc n/a cbctc counter/threshold configuration rw
0xa0c0 - 0xa0fc n/a cbctv counter/threshold value rw
0xa800 - 0xa878 n/a cbtc transmit filter configuration rw
0xa87c n/a cbtcd transmit filter configuration default rw
0xa880 n/a cbtfs circuit breaker transmit filter status r/w1c
0xa884 n/a cbtic circuit breaker transmit interrupt cause rc/w1c
0xa8c0 - 0xa8fc n/a cbtipv transmit filter ip address value rw
0xa940 - 0xa9bc n/a cbtipm transmit filter ip address mask rw
0xa9c0 - 0xa9fc n/a cbtptv transmit filter port / type value rw
0xaa40 - 0xaabc n/a cbtnhfv transmit filter ip next header/flags value rw
0xaac0 - 0xaafc n/a cbttfm transmit filter tcp flags mask rw
0xab40 - 0xabbc n/a cbtvln transmit filter vlan rw
0xac00 - 0xac78 n/a cbrc receive filter configuration rw
0xac7c n/a cbrcd receive filter configuration default rw
0xac80 n/a cbrfs circuit breaker receive filter status r/w1c
0xac84 n/a cbric circuit breaker receive interrupt cause rc/w1c
0xacc0 - 0xacfc n/a cbripv receive filter ip address value rw
0xad40 - 0xadbc n/a cbripm receive filter ip address mask rw
0xadc0 - 0xadfc n/a cbrptv receive filter port / type value rw
0xae40 - 0xaebc n/a cbrnhfv receive filter ip next header/flags value rw
0xaec0 - 0xaefc n/a cbrtfm receive filter tcp flags mask rw
0xaf40 - 0xafbc n/a cbrvln receive filter vlan rw
0x00f00 n/a circ circuits control ro
0x35e0 n/a txbdc tx dma performance burst and descriptor count rc
0x35e4 n/a txidle tx dma performance idle count rc
0x25e0 n/a rxbdc rx dma performance burst and descriptor count rc
0x25e4 n/a rxidle rx dma performance idle count rc
0x05b7c n/a ult1 ult1 register ro
0x05b80 n/a ult2 ult2 register ro
0x05b84 n/a strap strap register ro
pcs
0x4200 n/a pcs_cfg pcs configuration 0 register rw
0x4208 n/a pcs_lctl pcs link control register rw
0x420c n/a pcs_lsts pcs link status register ro
0x4210 n/a pcs_dbg0 pcs debug 0 register ro
0x4214 n/a pcs_dbg1 pcs debug 1 register ro
0x4218 n/a pcs_anadv an advertisement register rw
0x421c n/a pcs_lpab link partner ability register ro
0x4220 n/a pcs_nptx an next page transmit register rw
0x4224 n/a pcs_lpabnp link partner ability next page register ro
time sync
0x0b620 n/a tsyncrxctl rx time sync control register rw
0x0b624 n/a rxstmpl rx timestamp low ro
0x0b628 n/a rxstmph rx timestamp high ro
0x0b62c n/a rxsatrl rx timestamp attributes low ro
0x0b630 n/a rxsatrh rx timestamp attributes high ro
0x0b614 n/a tsynctxctl tx time sync control register rw
0x0b618 n/a txstmpl tx timestamp value low ro
0x0b61c n/a txstmph tx timestamp value high ro
0x0b600 n/a systiml system time register low rws
0x0b604 n/a systimh system time register high rws
0x0b608 n/a timinca increment attributes register rw
0x0b60c n/a timadjl time adjustment offset register low rw
0x0b610 n/a timadjh time adjustment offset register high rw
0x0b640 n/a tsauxc auxiliary control register rw
0x0b644 n/a trgttiml0 target time register 0 low rw
0x0b648 n/a trgttimh0 target time register 0 high rw
0x0b64c n/a trgttiml1 target time register 1 low rw
0x0b650 n/a trgttimh1 target time register 1 high rw
0x0b65c n/a auxstmpl0 auxiliary time stamp 0 register low ro
0x0b660 n/a auxstmph0 auxiliary time stamp 0 register high ro
0x0b664 n/a auxstmpl1 auxiliary time stamp 1 register low ro
0x0b668 n/a auxstmph1 auxiliary time stamp 1 register high ro
0x05f50 n/a tsyncrxcfg time sync rx configuration rw
0x0003c n/a tssdp time sync sdp config reg rw
macsec
0xb000 n/a lsectxcap macsec tx capabilities register ro
0xb300 n/a lsecrxcap macsec rx capabilities register ro
0xb004 n/a lsectxctrl macsec tx control register rw
0xb304 n/a lsecrxctrl macsec rx control register rw
0xb008 n/a lsectxscl macsec tx sci low rw
0xb00c n/a lsectxsch macsec tx sci high rw
0xb010 n/a lsectxsa macsec tx sa rw
0xb018 n/a lsectxpn0 macsec tx sa pn 0 rw
0xb01c n/a lsectxpn1 macsec tx sa pn 1 rw
0xb020 - 0xb02c n/a lsectxkey0 macsec tx key 0 wo
programming interface ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 465 0xb030 - 0xb03c n/a lsectxkey1 macsec tx key 1 wo 0xb3d0 n/a lsecrxscl[n] macsec rx sci low rw 0xb3e0 n/a lsecrxsch[n] macsec rx sci high rw 0xb310 - 0xb314 n/a lsecrxsa macsec rx sa rw 0xb330 - 0xb334 n/a lsecrxsapn macsec rx sa pn rw 0xb350 - 0xb36c n/a lsecrxkey[n, m] macsec rx key wo ipsec 0xb408 n/a ipsrxcmd ipsec rx command register rw 0xb400 n/a ipsrxidx ipsec rx index rw 0xb420 - 0xb42c n/a ipsrxipaddr ipsec rx ip address register rw 0xb410 - 0xb41c n/a ipsrxkey ipsec rx key register rw 0xb404 n/a ipsrxsalt ipsec rx salt register rw 0xb40c n/a ipsrxspi ipsec rx spi register rw 0xb450 n/a ipstxidx ipsec tx index rw 0xb460 - 0xb46c n/a ipstxkey ipsec tx key registers rw 0xb454 n/a ipstxsalt ipsec tx salt register rw 0xb430 n/a ipsctrl ipsec control rw 0x05f40 n/a ctsrxctl cts rx control rw 0xb100 n/a ctstxctl cts tx control rw 0xb104 n/a ctstxh0 cts tx header 0 rw 0xb108 n/a ctstxh1 cts tx header 1 rw 5e00 + 4*n (n=0...31) n/a ctsrxt cts rx tags rw 5f60 + 4*n (n=0...3) n/a ctsrxmngt cts rx mng tags rw 0x3700 + 4*n (n=0...63) n/a ctstxt cts tx tags rw table 8-6. register summary (continued) offset alias offset abbreviation name rw
certain registers maintain an alias address designed for backward compatibility with software written for previous gbe controllers. for these registers, the alias address is shown in the table above. software can access those registers at either the new offset or the alias offset. software written solely for the 82576 should use the new address offset.

8.1.4 msi-x bar register summary

table 8-7. msi-x register summary
category  offset  abbreviation  name  rw  page
msi-x table  0x0000 + n*0x10 [n=0...24]  msixtadd  msi-x table entry lower address  rw  page 513
msi-x table  0x0004 + n*0x10 [n=0...24]  msixtuadd  msi-x table entry upper address  rw  page 514
msi-x table  0x0008 + n*0x10 [n=0...24]  msixtmsg  msi-x table entry message  rw  page 514
msi-x table  0x000c + n*0x10 [n=0...24]  msixtvctrl  msi-x table entry vector control  rw  page 514
msi-x table  0x02000  msixpba  msixpba bit description  ro  page 514

8.2 general register descriptions

8.2.1 device control register - ctrl (0x00000; r/w)

this register, as well as the extended device control register (ctrl_ext), controls the major operational modes for the device. while software writes to this register to control device settings, several bits (such as fd and speed) can be overridden depending on other bit settings and the resultant link configuration determined by the phy's auto-negotiation resolution. see section 4.5.7 for details on the setup of these registers in the different link modes.

note: this register is also aliased at address 0x0004.

field  bit(s)  initial value  description
fd  0  1b (note 1)  full-duplex. controls the mac duplex setting when explicitly set by software. 0b = half duplex. 1b = full duplex.
reserved  1  0b  this bit is reserved and should be set to 0b for future compatibility.
gio master disable 2 0b when set to 1b, this bit blocks new master requests by this function, including manageability requests. once no master requests are pending by this function, the gio master enable status bit (status register, bit 19) is cleared.
link reset  3  1b (note 1)  link reset. 0b = normal; 1b = reset. used to reset/restart the link auto-negotiation process when using serdes mode.
reserved  4  0b  reserved.
reserved  5  0b (note 1)  reserved. must be set to 0b. (was asde.)
slu  6  0b (note 1)  set link up. when the mac link mode is set for gmii/mii mode (internal phy), set link up must be set to 1b to permit the mac to recognize the link signal from the phy, which indicates that the phy has established link and is ready to receive and transmit data. see section 3.5.4 for more information about auto-negotiation and link configuration in the various modes. set link up is normally initialized to 0b. however, if the apm enable bit is set in the eeprom, it is initialized to 1b.
ilos  7  0b (note 1)  invert loss-of-signal (los/link) signal. 0b = do not invert (active high input signal). 1b = invert signal (active low input signal). should be set to 0b when using the internal phy.
speed  9:8  10b  speed selection. these bits determine the speed configuration and are written by software after reading the phy configuration through the mdio interface. these signals are ignored when auto-speed detection is enabled. 00b = 10 mb/s. 01b = 100 mb/s. 10b = 1000 mb/s. 11b = not used.
reserved  10  0b  reserved. write as 0b to ensure future compatibility.
frcspd  11  0b (note 1)  force speed. this bit is set when software needs to manually configure the mac speed settings according to the speed bits. when using a phy, note that it must resolve to the same speed configuration or software must manually set it to the same speed as the mac. the default is asserted. software must clear this bit to enable the phy or asd function to control the mac speed setting. note that this bit is superseded by the ctrl_ext.spd_byps bit, which has a similar function.
frcdplx  12  0b  force duplex.
when set to 1b, software overrides the duplex indication that the phy presents to the mac on the fdx signal. otherwise, in 10/100/1000base-t link mode, the duplex setting is sampled from the phy fdx indication into the mac on the asserting edge of the phy link signal. when frcdplx is asserted, the ctrl.fd bit sets the duplex.
intel ? 82576eb gbe controller ? programming interface intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 468 reserved 15:13 0b reserved. reads as 0b. sdp0_gpien 16 0b general purpose interrupt detection enable for sdp0. if software-controlled io pin sdp0 is configured as an input, this bit (when 1b) enables the use for gpi interrupt detection. sdp1_gpien 17 0b general purpose interrupt detection enable for sdp1. if software-controlled io pin sdp1 is configured as an input, this bit (when 1b) enables the use for gpi interrupt detection. sdp0_data (rws) 18 0b 1 sdp0 data value. used to read or write the value of software-controlled io pin sdp0. if sdp0 is configured as an output (sdp0_iodir = 1b), this bit controls the value driven on the pin (initial value eeprom-configurable). if sdp0 is configured as an input, reads return the current value of the pin. when the sdp0_wde bit is set, this field indicates the polarity of the watchdog indication. sdp1_data (rws) 19 0b 1 sdp1 data value. used to read or write the value of software-controlled io pin sdp1. if sdp1 is configured as an output (sdp1_iodir = 1b), this bit controls the value driven on the pin (initial value eeprom-configurable). if sdp1 is configured as an input, reads return the current value of the pin. advd3wuc 20 1b 1 d3cold wake up capability advertisement enable. when set, d3cold wake up capability is advertised based on whether aux_pwr advertises presence of auxiliary power (yes if aux_pwr is indicated, no otherwise). when 0b, however, d3cold wake up capability is not advertised even if aux_pwr presence is indicated. note that the initial value is eeprom configurable. if full 1gb/sec. operation in d3 state is desired but the system's power requirements in this mode would exceed the d3cold wake up-enabled specification limit (375ma at 3.3v), this bit can be used to prevent the capability from being advertised to the system. 
sdp0_wde 21 0b 1 sdp0 used for watchdog indication when set, sdp0 is used as a watchdog indication. when set, the sdp0_data bit indicates the polarity of the watchdog indication. in this mode, sdp0_iodir must be set to an output. sdp0_iodir 22 0b 1 sdp0 pin direction. controls whether software-controllable pin sdp0 is configured as an input or output (0b = input, 1b = output). initial value is eeprom-configurable. this bit is not affected by software or system reset, only by initial power-on or direct software writes. sdp1_iodir 23 0b 1 sdp1 pin direction. controls whether software-controllable pin sdp1 is configured as an input or output (0b = input, 1b = output). initial value is eeprom-configurable. this bit is not affected by software or system reset, only by initial power-on or direct software writes. reserved 25:24 0b 1 reserved. formerly used as sdp3 and sdp2 pin input/output direction control, respectively. field bit(s) initial value description
rst  26  0b  device reset. this bit performs a reset of the entire controller device, resulting in a state nearly approximating the state following a power-up reset or internal pcie reset, except for system pci configuration. 0b = normal. 1b = reset. this bit is self-clearing and is referred to as software reset or global reset.
rfce  27  0b  receive flow control enable. when set, indicates that the 82576 responds to the reception of flow control packets. if auto-negotiation is enabled, this bit should be set to the negotiated flow control value. in serdes mode the resolution is done by the hardware. in internal phy or sgmii modes it should be done by the software.
tfce  28  0b  transmit flow control enable. when set, indicates that the 82576 transmits flow control packets (xon and xoff frames) based on the receiver fullness. if auto-negotiation is enabled, this bit should be set to the negotiated flow control value. in serdes mode the resolution is done by the hardware. in internal phy or sgmii modes it should be done by the software.
reserved  29  0b  reserved.
vme  30  0b  vlan mode enable. when set to 1b, vlan information is stripped from all received 802.1q packets.
phy_rst  31  0b  phy reset. controls a hardware-level reset to the internal phy. 0b = normal operation. 1b = phy reset asserted.
1. if the signature bits of the eeprom's initialization control word 1 match (01b), these bits are read from the eeprom.
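the ctrl bit positions listed above can be collected into masks. the following is a minimal sketch, not a driver implementation: the macro names and the helper ctrl_force_link() are hypothetical, and the actual register read and write (at offset 0x00000) are left to the caller. it assumes internal phy (gmii/mii) mode, where the table states that slu must be set.

```c
#include <assert.h>
#include <stdint.h>

/* bit positions come from the ctrl field table; the names are illustrative only */
#define CTRL_FD           (1u << 0)    /* full duplex */
#define CTRL_SLU          (1u << 6)    /* set link up */
#define CTRL_SPEED_SHIFT  8            /* speed field, bits 9:8 */
#define CTRL_SPEED_MASK   (3u << CTRL_SPEED_SHIFT)
#define CTRL_FRCSPD       (1u << 11)   /* force speed */
#define CTRL_FRCDPLX      (1u << 12)   /* force duplex */

enum ctrl_speed { CTRL_SPEED_10 = 0, CTRL_SPEED_100 = 1, CTRL_SPEED_1000 = 2 };

/* returns a new ctrl value that forces mac speed and duplex.
 * all bits other than fd, slu, speed, frcspd and frcdplx are preserved. */
static uint32_t ctrl_force_link(uint32_t ctrl, enum ctrl_speed speed, int full_duplex)
{
    assert((unsigned)speed <= 2u);              /* 11b is "not used" */
    ctrl &= ~(CTRL_SPEED_MASK | CTRL_FD);       /* clear old speed and duplex */
    ctrl |= (uint32_t)speed << CTRL_SPEED_SHIFT;
    if (full_duplex)
        ctrl |= CTRL_FD;
    ctrl |= CTRL_FRCSPD | CTRL_FRCDPLX | CTRL_SLU;
    return ctrl;
}
```

a caller would read ctrl, pass the value through ctrl_force_link(), and write the result back. as the frcspd description notes, the phy must be configured to the same speed separately.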
intel ? 82576eb gbe controller ? programming interface intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 470 8.2.2 device status register - status (0x00008; r) field bit(s) initial value description fd 0 x full duplex. 0 = half duplex 1= full duplex reflects duplex setting of the mac and/or link. fd reflects the actual mac duplex configuration. this normally reflects the duplex setting for the entire link, as it normally reflects the duplex configuration negotiated between the phy and link partner (copper link) or mac and link partner (fiber link). lu 1 x link up. 0 = no link established 1 = link established for this bit to be valid, the set link up bit of the device control register (ctrl.slu) must be set. link up provides a useful indication of whether something is attached to the port. successful negotiation of features/link parameters results in link activity. the link startup process (and consequently the duration for this activity after reset) can be several 100's of ms. when the internal phy is used, this reflects whether the phy's link indication is present. when the serdes or sgmii interface is used, this indicates loss-of-signal; if auto-negotiation is also enabled, this can also indicate successful auto-negotiation. refer to section 3.5.4 for more details. lan id 3:2 0b lan id. provides software a mechanism to determine the lan identifier for the mac. 00b = lan 0. 01b = lan 1. txoff 4 x transmission paused. this bit indicates the state of the transmit function when symmetrical flow control has been enabled and negotiated with the link partner. this bit is set to 1b when transmission is paused due to the reception of an xoff frame. it is cleared (0b) upon expiration of the pause timer or the receipt of an xon frame. reserved 5 x reserved.
programming interface ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 471 speed 7:6 x link speed setting. reflects the speed setting of the mac and/or link when it is operating in 10/100/1000base-t mode (internal phy). when the mac is operating in 10/100/1000base-t mode with the internal phy, these bits normally reflect the speed of the actual link, negotiated by the phy and link partner and reflected internally from the phy to the mac (spd_ind). these bits also might represent the speed configuration of the mac only, if the mac speed setting has been forced via software (ctrl.speed) or if mac auto-speed detection is used. if auto-speed detection is enabled, the 82576's speed is configured only once after the link signal is asserted by the phy. 00b = 10 mb/s. 01b = 100 mb/s. 10b = 1000 mb/s. 11b = 1000 mb/s. asdv 9:8 x auto-speed detection value. speed result sensed by the 82576?s mac auto-detection function. these bits are provided for diagnostics purposes only. the asd calculation can be initiated by software writing a logic 1b to the ctrl_ext.asdchk bit. the resultant speed detection is reflected in these bits. phyra 10 1b phy reset asserted. this read/write bit is set by hardware following the assertion of a phy reset; it is cleared by writing a 0b to it. this bit is also used by firmware indicating a required initialization of the 82576?s phy. reserved 13:11 0x0 reserved. num vfs 17:14 0x0 reflects the value of the num vfs in the iov capability structure. iov mode 18 0b reflects the value of the vf enable (vfe) bit in the iov capability structure. gio master enable status 19 1b cleared by the 82576 when the gio master disable bit is set and no master requests are pending by this function. set otherwise. indicates that no master requests are issued by this function as long as the gio master disable bit is set. reserved 30:20 0x0 reserved. 
dma clock gating enable 31 1b (note 1) dma clock gating enable bit, loaded from the eeprom. indicates that the device supports gating of the dma clock.
1. if the signature bits of the eeprom's initialization control word 1 match (01b), this bit is read from the eeprom.
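the status fields above can be unpacked with shifts and masks. this is a sketch only: the struct and function names are hypothetical, and it decodes a value the caller has already read from offset 0x00008. note that, per the lu description, link_up is meaningful only when ctrl.slu is set.

```c
#include <assert.h>
#include <stdint.h>

/* decoded view of the status register; field names are illustrative */
struct status_info {
    int      full_duplex;         /* bit 0 */
    int      link_up;             /* bit 1 */
    unsigned lan_id;              /* bits 3:2 */
    unsigned speed_mbps;          /* bits 7:6, both 10b and 11b mean 1000 mb/s */
    unsigned num_vfs;             /* bits 17:14 */
    int      iov_mode;            /* bit 18 */
    int      gio_master_enabled;  /* bit 19 */
};

static struct status_info status_decode(uint32_t status)
{
    static const unsigned speed_table[4] = { 10u, 100u, 1000u, 1000u };
    struct status_info s;

    s.full_duplex        = (int)((status >> 0) & 1u);
    s.link_up            = (int)((status >> 1) & 1u);
    s.lan_id             = (status >> 2) & 3u;
    s.speed_mbps         = speed_table[(status >> 6) & 3u];
    s.num_vfs            = (status >> 14) & 0xFu;
    s.iov_mode           = (int)((status >> 18) & 1u);
    s.gio_master_enabled = (int)((status >> 19) & 1u);
    return s;
}
```

after setting ctrl gio master disable, software could poll until gio_master_enabled reads 0, which the table describes as the indication that no master requests are pending.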
8.2.3 extended device control register - ctrl_ext (0x00018; r/w)

field  bit(s)  initial value  description
reserved  1:0  0b  reserved. should be written as 0b to ensure future compatibility.
sdp2_gpien  2  0b  general purpose interrupt detection enable for sdp2. if software-controllable io pin sdp2 is configured as an input, this bit (when set to 1b) enables use for gpi interrupt detection.
sdp3_gpien  3  0b  general purpose interrupt detection enable for sdp3. if software-controllable io pin sdp3 is configured as an input, this bit (when set to 1b) enables use for gpi interrupt detection.
reserved  5:4  00b  reserved. reads as 00b.
sdp2_data  6  0b (note 1)  sdp2 data value. used to read (write) the value of software-controllable io pin sdp2. if sdp2 is configured as an output (sdp2_iodir = 1b), this bit controls the value driven on the pin (initial value eeprom-configurable). if sdp2 is configured as an input, reads return the current value of the pin.
sdp3_data  7  0b (note 1)  sdp3 data value. used to read (write) the value of software-controllable io pin sdp3. if sdp3 is configured as an output (sdp3_iodir = 1b), this bit controls the value driven on the pin (initial value eeprom-configurable). if sdp3 is configured as an input, reads return the current value of the pin.
reserved  9:8  0b (note 1)  reserved. formerly used as sdp5 and sdp4 pin input/output direction control, respectively.
sdp2_iodir  10  0b (note 1)  sdp2 pin direction. controls whether software-controllable pin sdp2 is configured as an input or output (0b = input, 1b = output). initial value is eeprom-configurable. this bit is not affected by software or system reset, only by initial power-on or direct software writes.
sdp3_iodir  11  0b (note 1)  sdp3 pin direction. controls whether software-controllable pin sdp3 is configured as an input or output (0b = input, 1b = output). initial value is eeprom-configurable.
this bit is not affected by software or system reset, only by initial power-on or direct software writes. asdchk 12 0b asd check. initiates an auto-speed-detection (asd) sequence to sense the frequency of the phy receive clock (rx_clk). the results are reflected in status.asdv. this bit is self-clearing.
ee_rst  13  0b  eeprom reset. when set, initiates a reset-like event to the eeprom function. this causes the eeprom to be read as if a rst# assertion had occurred. all the 82576 functions should be disabled prior to setting this bit. this bit is self-clearing.
pfrstd (sc)  14  0b  pf reset done. when set, the rsti bit in all the vfmailbox regs is cleared and the rstd bit in all the vfmailbox regs is set.
spd_byps  15  0b  speed select bypass. when set to 1b, all speed detection mechanisms are bypassed, and the 82576 is immediately set to the speed indicated by ctrl.speed. this gives software full control of the speed settings of the 82576, and of when the change takes place, by overriding the hardware clock switching circuitry.
ns_dis  16  0b  no snoop disable. when set to 1b, the 82576 does not set the no snoop attribute in any pcie packet, independent of pcie configuration and the setting of individual no snoop enable bits. when set to 0b, behavior of no snoop is determined by pcie configuration and the setting of individual no snoop enable bits.
ro_dis  17  0b  relaxed ordering disabled. when set to 1b, the 82576 does not request any relaxed ordering transactions regardless of the state of bit 4 in the pcie device control register (offset 0xa8). when this bit is cleared and bit 4 of the pcie device control register is set, the 82576 requests relaxed ordering transactions as provided by registers rxctl and txctl (per queue and per flow).
serdes low power enable  18  0b (note 1)  when set, allows the serdes to enter a low power state when the function is in dr state as described in section 5.5.4.
l1 enable  19  0b (note 1)  when set, enables l1 indication.
phy power down enable  20  1b (note 1)  when set, enables the phy to enter a low-power state as described in section 5.4.3.
reserved  21  0b  reserved. should be set to 0b.
link_mode  23:22  0b (note 1)  link mode.
this controls which interface is used to talk to the link. 00b = direct copper (1000base-t) interface (10/100/1000base-t internal phy mode). 01b = reserved. 10b = sgmii. 11b = internal serdes interface.
reserved  24  0b  reserved.
i2c enabled  25  0b (note 1)  enable i2c. this bit enables the i2c bus that can be used to access sfp modules in the eeprom. if cleared, the i2c pads are isolated and accesses through i2ccmd are ignored.
extended vlan  26  0b (note 1)  extended vlan. when set, all incoming rx packets are expected to have at least one vlan with the ether type as defined in vet.ext_vet that should be ignored. the packets can have a second vlan that should be used for all filtering purposes. all tx packets are expected to have at least one vlan added to them by the host. in the case of an additional vlan request (vle), the second vlan is added after the vlan added by the host. this bit is reset only by a power up reset or by an eeprom full auto load and should only be changed while tx and rx processes are stopped.
rs_rt_en  27  0b  0b = rate schedulers are not reset at link speed change. 1b = rate schedulers are reset at link speed change.
drv_load  28  0b  driver loaded. this bit should be set by the driver after it is loaded. this bit should be cleared when the driver unloads or after a pcie soft reset. the mng controller loads this bit to indicate to the manageability controller that the driver has loaded.
reserved  29  0b  reserved.
reserved  31:30  0b  reserved.
1. these bits are read from the eeprom.

this register provides extended control of the 82576's functionality beyond that provided by the device control register (ctrl).

the 82576 allows up to four externally controlled interrupts. the pins used are all software-definable pins, which can be mapped for use as gpi interrupt bits. mappings are enabled by the sdpx_gpien bits only when these signals are also configured as inputs via sdpx_iodir. when configured to function as external interrupt pins, a gpi interrupt is generated when the corresponding pin is sampled in an active-high state. the bit mappings are shown in the table below for clarity.

table 8-8. mappings for sdp pins used as gpi
sdp pin used as gpi  direction field  enable as gpi interrupt field  resulting icr bit (gpi)
3  sdp3_iodir  sdp3_gpien  14
2  sdp2_iodir  sdp2_gpien  13
1  sdp1_iodir  sdp1_gpien  12
0  sdp0_iodir  sdp0_gpien  11

note: if software uses the ee_rst function and desires to retain current configuration information, the contents of the control registers should be read and stored by software. control register values are changed by a read of the eeprom which occurs upon assertion of the ee_rst bit.
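the gpi mapping can be expressed as a small lookup. this is a sketch under stated assumptions: the struct and function names are hypothetical, and the bit positions are the ones listed in the ctrl table (sdp0/sdp1: gpien bits 16 and 17, iodir bits 22 and 23) and the ctrl_ext table (sdp2/sdp3: gpien bits 2 and 3, iodir bits 10 and 11).

```c
#include <assert.h>
#include <stdint.h>

/* where the gpi-related bits of one sdp pin live; names are illustrative */
struct sdp_gpi_bits {
    int      in_ctrl_ext;  /* 0 = fields are in ctrl, 1 = fields are in ctrl_ext */
    uint32_t gpien_mask;   /* sdpx_gpien bit */
    uint32_t iodir_mask;   /* sdpx_iodir bit, must be 0b (input) for gpi use */
    uint32_t icr_mask;     /* resulting interrupt cause bit, icr bits 11..14 */
};

static struct sdp_gpi_bits sdp_gpi_lookup(unsigned pin)
{
    struct sdp_gpi_bits b;

    assert(pin < 4u);
    if (pin < 2u) {                        /* sdp0, sdp1: ctrl register */
        b.in_ctrl_ext = 0;
        b.gpien_mask  = 1u << (16u + pin);
        b.iodir_mask  = 1u << (22u + pin);
    } else {                               /* sdp2, sdp3: ctrl_ext register */
        b.in_ctrl_ext = 1;
        b.gpien_mask  = 1u << pin;         /* bits 2 and 3 */
        b.iodir_mask  = 1u << (8u + pin);  /* bits 10 and 11 */
    }
    b.icr_mask = 1u << (11u + pin);
    return b;
}
```

to use a pin as a gpi, software would clear the iodir bit and set the gpien bit in the indicated register, then watch for icr_mask in the interrupt cause.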
the eeprom reset function can read configuration information out of the eeprom which affects the configuration of pcie space bar settings. the changes to the bars are not visible unless the system reboots and the bios is allowed to re-map them.

the spd_byps bit performs a similar function to the ctrl.frcspd bit in that the 82576's speed settings are determined by the value software writes to the ctrl.speed bits. however, with the spd_byps bit asserted, the settings in ctrl.speed take effect immediately rather than waiting until after the 82576's clock switching circuitry performs the change.

8.2.4 mdi control register - mdic (0x00020; r/w)

software uses this register to read or write management data interface (mdi) registers in the internal phy or an external sgmii phy. see section 3.5.2.2.1 for details of the usage of this register. the phy registers accessible through the mdic register are described in section 8.25.

field  bit(s)  initial value  description
data  15:0  x  data. in a write command, software places the data bits and the mac shifts them out to the phy. in a read command, the mac reads these bits serially from the phy and software can read them from this location.
regadd  20:16  0b  phy register address: reg. 0, 1, 2,...31
phyadd  25:21  0b  phy address.
op  27:26  0b  opcode. 01b = mdi write. 10b = mdi read. all other values are reserved.
r (rws)  28  1b  ready bit. set to 1b by the 82576 at the end of the mdi transaction (for example, indication of a read or write completion). it should be reset to 0b by software at the same time the command is written.
i  29  0b  interrupt enable. when set to 1b by software, it causes an interrupt to be asserted to indicate the end of an mdi cycle.
e (rws)  30  0b  error. this bit is set to 1b by hardware when it fails to complete an mdi read.
software should make sure this bit is clear (0b) before issuing an mdi read or write command. destination 31 0b destination. 0b = the transaction is to the internal phy. 1b = the transaction is directed to the external mdio interface.
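the mdic layout above lends itself to a pair of command builders and a completion check. this is a minimal sketch: all names are hypothetical, the register access itself (offset 0x00020) is left to the caller, and the commands target the internal phy (destination bit left at 0b). the ready bit is written as 0b in every command, as the table requires.

```c
#include <assert.h>
#include <stdint.h>

/* field positions from the mdic table; names are illustrative */
#define MDIC_DATA_MASK  0xFFFFu
#define MDIC_REG_SHIFT  16           /* regadd, bits 20:16 */
#define MDIC_PHY_SHIFT  21           /* phyadd, bits 25:21 */
#define MDIC_OP_WRITE   (1u << 26)   /* op = 01b */
#define MDIC_OP_READ    (2u << 26)   /* op = 10b */
#define MDIC_READY      (1u << 28)
#define MDIC_ERROR      (1u << 30)

/* command word for an mdi read of register `reg` on phy `phy` */
static uint32_t mdic_read_cmd(unsigned phy, unsigned reg)
{
    assert(phy < 32u && reg < 32u);
    return MDIC_OP_READ | ((uint32_t)phy << MDIC_PHY_SHIFT)
                        | ((uint32_t)reg << MDIC_REG_SHIFT);
}

/* command word for an mdi write of `data` to register `reg` on phy `phy` */
static uint32_t mdic_write_cmd(unsigned phy, unsigned reg, uint16_t data)
{
    assert(phy < 32u && reg < 32u);
    return MDIC_OP_WRITE | ((uint32_t)phy << MDIC_PHY_SHIFT)
                         | ((uint32_t)reg << MDIC_REG_SHIFT) | data;
}

/* inspects a value read back from mdic.
 * returns 1 and stores the data field when the transaction is complete,
 * 0 while it is still in progress, and -1 when the error bit is set. */
static int mdic_result(uint32_t mdic, uint16_t *data)
{
    if (mdic & MDIC_ERROR)
        return -1;
    if (!(mdic & MDIC_READY))
        return 0;
    *data = (uint16_t)(mdic & MDIC_DATA_MASK);
    return 1;
}
```

a caller would write the command word, then re-read mdic and pass it to mdic_result() until the result is nonzero, with whatever timeout policy suits the platform.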
intel ? 82576eb gbe controller ? programming interface intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 476 8.2.5 serdes ana - serdesctl (0x00024; r/w) 8.2.6 copper/fiber switch control - connsw (0x00034; r/w) field bit(s) initial value description data 7:0 0b data to serdes. address 15:8 0b address to serdes. reserved 30:16 0b reserved. done indication 31 1b when a write operation completes, this bit is set to 1b indicating that new data can be written. this bit is over written to 0b by new data. field bit(s) initial value description autosense_ en 0 0b auto sense enable. when set, the auto sense mode is active. in this mode the non-active link is sensed by hardware as follows phy sensing: the electrical idle detector of the receiver of the phy is activated while in serdes or sgmii mode. serdes sensing: the electrical idle detector of the receiver of the serdes is activated while in internal phy mode, assuming the enrgsrc bit is cleared. if energy is detected in the non active media, the omed bit in the icr register is set and this bit is cleared. this includes the case where energy was present at the non-active media when this bit is being set. autosense_ conf 1 0b auto sense config mode. this bit should be set during the configuration of the phy/ serdes towards the activation of the auto-sense mode. while this bit is set, the phy/serdes is active even though the active link is set to serdes or sgmii/phy. energy detection while this bit is set is not reflected to the omed interrupt. enrgsrc 2 0b 1 1. words 0x24 and 0x14 (bit 15) in the eeprom defines the default of the enrgsrc bit in this register for lan0 and lan1 respectively. serdes energy detect source. if set, the omed interrupt cause is set after asserting the external signal detect pin. if cleared, the omed interrupt cause is set after exiting from electrical idle of the serdes receiver. 
this bit also defines the source of the signal detect indication used to set link up while in serdes mode.
reserved  8:3  0x0  reserved.
serdesd (ro)  9  x  serdes signal detect indication. indicates the serdes signal detect value according to the selected source (either external or internal). valid only if link_mode is serdes or sgmii.
physd (ro)  10  x  phy signal detect indication. valid only if link_mode is the phy and the receiver is not in electrical idle.
reserved  31:11  0x0  reserved.
8.2.7 vlan ether type - vet (0x00038; r/w)

this register contains the type field that hardware matches against to recognize an 802.1q (vlan) ethernet packet, and that hardware uses when it adds a vlan tag to transmitted ethernet packets. to be compliant with the 802.3ac standard, this register should be programmed with the value 0x8100. for vlan transmission the upper byte is first on the wire (vet[15:8]).

field  bit(s)  initial value  description
vet  15:0  0x8100  vlan ethertype. should be programmed with 0x8100.
vet ext  31:16  0x8100  external vlan ether type.

8.2.8 led control - ledctl (0x00e00; rw)

this register controls the setup of the leds. see section 7.5.1 for details of the mode fields encoding.

field  bit(s)  initial value  description
led0_mode  3:0  0010b (note 1)  led0/link# mode. this field specifies the control source for the led0 output. an initial value of 0010b selects link_up# indication.
led_pci_mode  4  0b  0b = use leds as defined in the other fields of this register. 1b = use leds to indicate pci-e lanes idle status in sdp mode (only when the led_mode is set to 0x8, sdp mode). for port 0, led0 3-0 indicates rx lanes 3-0 electrical idle status. for port 1, led1 3-0 indicates tx lanes 3-0 electrical idle status.
global_blink_mode  5  0b (note 1)  global blink mode. this field specifies the blink mode of all the leds. 0b = blink at 200 ms on and 200 ms off. 1b = blink at 83 ms on and 83 ms off.
led0_ivrt  6  0b (note 1)  led0/link# invert. this field specifies the polarity/inversion of the led source prior to output or blink control. 0b = do not invert led source. 1b = invert led source.
led0_blink  7  0b (note 1)  led0/link# blink. this field specifies whether to apply blink logic to the (possibly inverted) led control source prior to the led output. 0b = do not blink asserted led output. 1b = blink asserted led output.
led1_mode  11:8  0011b (note 1)  led1/activity# mode.
this field specifies the control source for the led1 output. an initial value of 0011b selects filter activity# indication. reserved 12 0b reserved. read-only as 0b. write as 0b for future compatibility.
reserved  13  0b  reserved.
led1_ivrt  14  0b (note 1)  led1/activity# invert.
led1_blink  15  1b (note 1)  led1/activity# blink.
led2_mode  19:16  0110b (note 1)  led2/link100# mode. this field specifies the control source for the led2 output. an initial value of 0110b selects link100# indication.
reserved  20  0b  reserved. read-only as 0b. write as 0b for future compatibility.
reserved  21  0b  reserved.
led2_ivrt  22  0b (note 1)  led2/link100# invert.
led2_blink  23  0b (note 1)  led2/link100# blink.
led3_mode  27:24  0111b (note 1)  led3/link1000# mode. this field specifies the control source for the led3 output. an initial value of 0111b selects link1000# indication.
reserved  28  0b  reserved. read-only as 0b. write as 0b for future compatibility.
reserved  29  0b  reserved.
led3_ivrt  30  0b (note 1)  led3/link1000# invert.
led3_blink  31  0b (note 1)  led3/link1000# blink.
1. these bits are read from eeprom words 0x1c and 0x1f for port a and from eeprom words 0x2b and 0x2c for port b.

8.3 packet buffers control register descriptions

these registers set the on-chip receive, transmit, and loopback storage allocation. the partitioning size is 1 kb.

note: programming these registers automatically initializes the internal packet-buffer ram pointers.

for best performance, the transmit buffer allocation should be set to accept two full-sized packets (for good 9500 byte jumbo frame performance, the transmit allocation should be a minimum of 18 kb). transmit packet buffer size should be configured to be more than 8 kb.

8.3.1 rx pb size - rxpbs (0x2404; rw)

field  bit(s)  initial value  description
rxpbsize0  6:0  0x40  rx packet buffer size. value is in kbytes. the default is 64k.
reserved  31:7  0x0  reserved.
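the ledctl table follows a regular pattern: led n occupies byte n of the register, with the mode in bits 3:0 of that byte, invert in bit 6, and blink in bit 7. the sketch below relies on that observation; the helper name is hypothetical, and only the four mode values quoted in the table (0010b, 0011b, 0110b, 0111b) are known from this section, the full encoding being in section 7.5.1.

```c
#include <assert.h>
#include <stdint.h>

/* per-led bits inside each byte of ledctl; names are illustrative */
#define LED_MODE_MASK  0x0Fu
#define LED_IVRT       (1u << 6)
#define LED_BLINK      (1u << 7)

/* returns ledctl with the mode, invert and blink fields of one led replaced.
 * bits 4 and 5 of each byte (led_pci_mode, global_blink_mode, reserved)
 * are left untouched. */
static uint32_t ledctl_set_led(uint32_t ledctl, unsigned led, unsigned mode,
                               int invert, int blink)
{
    unsigned shift = 8u * led;
    uint32_t field = mode;

    assert(led < 4u && mode <= LED_MODE_MASK);
    if (invert)
        field |= LED_IVRT;
    if (blink)
        field |= LED_BLINK;
    ledctl &= ~((uint32_t)(LED_MODE_MASK | LED_IVRT | LED_BLINK) << shift);
    return ledctl | (field << shift);
}
```

building the four initial values from the table (led0 = 0010b, led1 = 0011b with blink, led2 = 0110b, led3 = 0111b) on top of zero gives 0x07068302.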
8.3.2 tx pb size - txpbs (0x3404; rw)

field  bit(s)  initial value  description
txpbsize0  5:0  0x28  tx packet buffer size. value is in kbytes. the default is 40k.
reserved  31:6  0x0  reserved.

8.3.3 switch pb size - swpbs (0x3004; rw)

field  bit(s)  initial value  description
swpbsize0  4:0  0x14  switch packet buffer size. value is in kbytes. the default is 20k.
reserved  31:5  0x0  reserved.

8.3.4 tx packet buffer wrap around counter - pbtwac (0x34e8; ro)

field  bit(s)  initial value  description
wac0  2:0  0x0  reflects the wrap around of the entire packet buffer.
tc0e  3  1b  tx packet buffer is empty.
reserved  31:4  0x0  reserved.

8.3.5 rx packet buffer wrap around counter - pbrwac (0x24e8; ro)

field  bit(s)  initial value  description
wac0  2:0  0x0  reflects the wrap around of the entire packet buffer.
tc0e  3  1b  rx packet buffer is empty.
reserved  31:4  0x0  reserved.
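the three size registers (rxpbs, txpbs, swpbs) hold values in kbytes with different field widths, so a requested allocation can be sanity-checked before it is written. the sketch below is an assumption-laden illustration: the struct and function names are hypothetical, and it checks only the limits stated in this section (field widths of 7, 6, and 5 bits, and a transmit buffer of more than 8 kb). any constraint on the total of the three buffers is not stated here and is not checked.

```c
#include <assert.h>
#include <stdint.h>

/* requested packet buffer sizes in kbytes; names are illustrative */
struct pb_alloc {
    unsigned rx_kb;  /* rxpbs.rxpbsize0, bits 6:0 */
    unsigned tx_kb;  /* txpbs.txpbsize0, bits 5:0 */
    unsigned sw_kb;  /* swpbs.swpbsize0, bits 4:0 */
};

/* returns 1 when every size fits its register field and the transmit
 * buffer is larger than 8 kb, 0 otherwise */
static int pb_alloc_valid(const struct pb_alloc *a)
{
    assert(a != 0);
    return a->rx_kb <= 0x7Fu &&
           a->tx_kb <= 0x3Fu &&
           a->sw_kb <= 0x1Fu &&
           a->tx_kb > 8u;
}
```

the default values from the tables (64, 40, and 20 kbytes) pass this check.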
8.3.6 switch packet buffer wrap around counter - pbswac (0x30e8; ro)

field  bit(s)  initial value  description
wac0  2:0  0x0  reflects the wrap around of the entire packet buffer.
tc0e  3  1b  switch packet buffer is empty.
reserved  31:4  0x0  reserved.

8.4 eeprom/flash register descriptions

note: as the eeprom and the flash are shared resources between the two ports and the manageability firmware, access to these resources should be coordinated using the semaphore mechanism. see section 4.5.12 for details.

8.4.1 eeprom/flash control register - eec (0x00010; r/w)

this register provides software direct access to the eeprom. software can control the eeprom by successive writes to this register. data and address information is clocked into the eeprom by software toggling the ee_sk and ee_di bits (0 and 2) of this register with ee_cs set to 0b. data output from the eeprom is latched into the ee_do bit (bit 3) via the internal 62.5 mhz clock and can be accessed by software via reads of this register.

note: software should not attempt to write to the flash device via pcie bar or via i/o access when writes are disabled (fwe is not equal to 10b). behavior after such an operation is undefined and can result in component and/or system hangs. bit banging access to the flash via the fla register is not protected by this field.

field  bit(s)  initial value  description
ee_sk  0  0b  clock input to the eeprom. when ee_gnt = 1b, the ee_sk output signal is mapped to this bit and provides the serial clock input to the eeprom. software clocks the eeprom via toggling this bit with successive writes.
ee_cs  1  0b  chip select input to the eeprom. when ee_gnt = 1b, the ee_cs output signal is mapped to the chip select of the eeprom device. software enables the eeprom by writing a 1b to this bit.
ee_di  2  0b  data input to the eeprom.
when ee_gnt = 1b, the ee_di output signal is mapped directly to this bit. software provides data input to the eeprom via writes to this bit. ee_do (ro) 3 x data output bit from the eeprom. the ee_do input signal is mapped directly to this bit in the register and contains the eeprom data output. this bit is ro from a software perspective; writes to this bit have no effect.
programming interface ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 481 fwe 5:4 01b flash write enable control these two bits, control whether writes to flash memory are allowed. 00b = flash erase (along with bit 31 in the fla register). 01b = flash writes disabled. 10b = flash writes enabled. 11b = not allowed. ee_req 6 0b request eeprom access. the software must write a 1b to this bit to get direct eeprom access. it has access when ee_gnt is 1b. when the software completes the access it must write a 0b. ee_gnt 7 0b grant eeprom access. when this bit is 1b the software can access the eeprom using the sk, cs, di, and do bits. ee_pres (ro) 8 1b eeprom present. this bit indicates that an eeprom is present by monitoring the ee_do input for an active-low acknowledge by the serial eeprom during initial eeprom scan. 1b = eeprom present. auto_rd (ro) 9 0b eeprom auto read done. when set to 1b, this bit indicates that the auto read by hardware from the eeprom is done. this bit is also set when the eeprom is not present or when its signature is not valid. ee_addr_si ze 10 0b eeprom address size. this field defines the address size of the eeprom. this bit is set by the eeprom size auto-detect mechanism. if no eeprom is present or the signature is not valid, a 16-bit address is assumed. 0b = 8- and 9-bit. 1b = 16-bit. ee_size (ro) 14:11 0010b eeprom size this field defines the size of the eeprom: field value\eeprom size\eeprom address size 0000b 128 bytes - 1 byte 0001b 256 bytes - 1 byte 0010b 512 bytes - 1 byte 0011b 1 kbytes - 2 bytes 0100b 2 kbytes - 2 bytes 0101b 4 kbytes - 2 bytes 0110b 8 kbytes - 2 bytes 0111b 16 kbytes -2 bytes 1000b 32 kbytes - 2 bytes 1001b reserved 1 1011b - 1111b reserved reserved 31:15 0b reserved reads as 0b. 1. these bits are eeprom. field bit(s) initial value description
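the eec field layout above can be exercised with a small decoder. the following is a minimal sketch, not driver code: the bit positions and the ee_size encoding are taken from the eec table, while the helper names (bits, decode_eec) and the dictionary layout are assumptions made for illustration.

```python
# sketch: decode a raw 32-bit eec value into named fields.
# bit positions follow the eec table; helper names are hypothetical.

EE_SIZE_BYTES = {
    0b0000: 128, 0b0001: 256, 0b0010: 512, 0b0011: 1024, 0b0100: 2048,
    0b0101: 4096, 0b0110: 8192, 0b0111: 16384, 0b1000: 32768,
}

FWE_MEANING = {
    0b00: "flash erase",
    0b01: "flash writes disabled",
    0b10: "flash writes enabled",
    0b11: "not allowed",
}


def bits(value, hi, lo):
    """Return the field value[hi:lo] (inclusive bit range)."""
    return (value >> lo) & ((1 << (hi - lo + 1)) - 1)


def decode_eec(value):
    """Split an eec register value into the fields listed in the table."""
    return {
        "ee_sk": bits(value, 0, 0),
        "ee_cs": bits(value, 1, 1),
        "ee_di": bits(value, 2, 2),
        "ee_do": bits(value, 3, 3),
        "fwe": FWE_MEANING[bits(value, 5, 4)],
        "ee_req": bits(value, 6, 6),
        "ee_gnt": bits(value, 7, 7),
        "ee_pres": bits(value, 8, 8),
        "auto_rd": bits(value, 9, 9),
        "ee_addr_size": bits(value, 10, 10),
        # None marks a reserved ee_size code
        "ee_size_bytes": EE_SIZE_BYTES.get(bits(value, 14, 11)),
    }
```

for example, a value built from the documented initial values (fwe = 01b, ee_pres = 1b, ee_size = 0010b) is 0x1110 and decodes to flash writes disabled, eeprom present, 512 bytes.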
8.4.2 eeprom read register - eerd (0x00014; rw)

this register is used by software to cause the 82576 to read individual words in the eeprom. to read a word, software writes the address to the read address field and simultaneously writes a 1b to the start read field. the 82576 reads the word from the eeprom and places it in the read data field, setting the read done field to 1b. software can poll this register, looking for a 1b in the read done field, and then use the value in the read data field. when this register is used to read a word from the eeprom, that word does not influence any of the 82576's internal registers even if it is normally part of the auto-read sequence.

field | bit(s) | initial value | description
start | 0 | 0b | start read. writing a 1b to this bit causes the eeprom to read a (16-bit) word at the address stored in the ee_addr field and then store the result in the ee_data field. this bit is self-clearing.
done (ro) | 1 | 0b | read done. set to 1b when the eeprom read completes. set to 0b when the eeprom read is not completed. writes by software are ignored. reset by setting the start bit.
addr | 15:2 | 0x0 | read address. this field is written by software along with start read to indicate the word to read.
data (ro) | 31:16 | x | read data. data returned from the eeprom read.

8.4.3 flash access - fla (0x0001c; r/w)

this register provides software direct access to the flash. software can control the flash by successive writes to this register. data and address information is clocked into the flash by software toggling the fl_sck bit (bit 0) of this register with fl_ce set to 1b. data output from the flash is latched into the fl_so bit (bit 3) of this register via the internal 125 mhz clock and can be accessed by software via reads of this register.

note: in the 82576, the flash access register is only reset at internal_power_on_reset and not, as in legacy devices, at a software reset.

field | bit(s) | initial value | description
fl_sck | 0 | 0b | clock input to the flash. when fl_gnt is 1b, the fl_sck out signal is mapped to this bit and provides the serial clock input to the flash device. software clocks the flash memory via toggling this bit with successive writes.
fl_ce | 1 | 0b | chip select input to the flash. when fl_gnt is 1b, the fl_ce output signal is mapped to the chip select of the flash device. software enables the flash by writing a 0b to this bit.
fl_si | 2 | 0b | data input to the flash. when fl_gnt is 1b, the fl_si output signal is mapped directly to this bit. software provides data input to the flash via writes to this bit.
fl_so | 3 | x | data output bit from the flash. the fl_so input signal is mapped directly to this bit in the register and contains the flash memory serial data output. this bit is read only from the software perspective; writes to this bit have no effect.
fl_req | 4 | 0b | request flash access. the software must write a 1b to this bit to get direct flash memory access. it has access when fl_gnt is 1b. when the software completes the access it must write a 0b.
fl_gnt | 5 | 0b | grant flash access. when this bit is 1b, the software can access the flash memory using the fl_sck, fl_ce, fl_si, and fl_so bits.
fla_add_size | 6 | 0b | flash address size. when fla_add_size is set, all flashes (including 64 kb) are accessed using 3 bytes of the address. if this bit is set by one of the functions, it is also reflected in the other one.
reserved | 29:7 | 0b | reserved. reads as 0b.
fl_busy | 30 | 0b | flash busy. this bit is set to 1b while a write or an erase to the flash memory is in progress. while this bit is clear (read as 0b) software can write a new byte to the flash device.
fl_er | 31 | 0b | flash erase command. this command is sent to the flash component only if the eec.fwe field is cleared. this bit is automatically cleared and read as 0b.

8.4.4 flash opcode - flashop (0x0103c; r/w)

this register enables the host or the firmware to define the op-code used to erase a sector of the flash or the complete flash. this register is reset only at internal_power_on_reset assertion. this register is common to both ports and to the manageability and should be programmed according to the parameters of the flash used.

note: the default values fit atmel* serial flash memory devices.

field | bit(s) | initial value | description
derase | 7:0 | 0x0062 | flash device erase instruction. the op-code for the flash erase instruction.
serase | 15:8 | 0x0052 | flash block erase instruction. the op-code for the flash block erase instruction. relevant only to flash access by manageability.
reserved | 31:16 | 0x0 | reserved.

8.4.5 eeprom diagnostic - eediag (0x01038; ro)

this register reflects the values of eeprom bits influencing the hardware that are not reflected otherwise.

field | bit(s) | initial value | description
lan0 disable strap behavior | 0 | 0b | reflects the inverse of bit 13 in eeprom word 0x20 controlling behavior of disabling strap for lan0.
lan1 disable strap behavior | 1 | 0b | reflects the inverse of bit 13 in eeprom word 0x10 controlling behavior of disabling strap for lan1.
lan1 disable | 2 | 0b | reflects bit 11 in eeprom word 0x10 controlling the disabling of lan1 as pcie.
lan1 pci disable | 3 | 0b | reflects bit 10 in eeprom word 0x10 controlling the disabling of lan1 as pcie.
eeprom deadlock release enable | 4 | 0b | reflects bit 5 in eeprom word 0x0a controlling the eeprom deadlock release enable.
dynamic iddq enable | 5 | 0b | reflects bit 15 in eeprom word 0x1e controlling the dynamic iddq enable.
pll shutdown enable | 6 | 0b | reflects bit 4 in eeprom word 0x0f controlling the pll shutdown enable control.
pll switch | 7 | 0b | reflects bit 5 in eeprom word 0x21 controlling the timing of the switch to pll clock.
nc-si clock out | 8 | 0b | reflects the clock out setting in bit 13 of eeprom word 0x21.
nc-si clock and i/o pads strength | 10:9 | 00b | reflects the clock and i/o pad drive strength settings in bits 15:14 of eeprom word 0x21.
sdp_iddq_en | 11 | 0b | reflects sdp behavior in the dynamic iddq setting in bit 6 of eeprom word 0xa.
eeprom parallel state | 13:12 | x | state of the eeprom parallel access arbitration state machine.
eeprom serial state | 15:14 | x | state of the eeprom serial access arbitration state machine.
flash serial state | 17:16 | x | state of the flash serial access arbitration state machine.
flash read data state | 19:18 | x | state of the flash read data bus arbitration state machine.
flash parallel state | 22:20 | x | state of the flash parallel access arbitration state machine.
reserved | 30:23 | 0x0 | reserved.
deadlock release | 31 | x | indicates a deadlock condition was detected in the eeprom and the current grant was released.

8.4.6 eeprom auto read bus control - eearbc (0x01024; r/w)

in eeprom-less implementations, this register is used to program the 82576 the same way it would be programmed if an eeprom were present. see section 3.3.1.7.1 for details of this register usage. this register is common to both functions and should be accessed only after coordination with the other port.

field | bit(s) | initial value | description
valid_core0 | 0 | 0b | valid write active to core 0. write strobe to core 0. firmware/software sets this bit for write access. software should clear this bit to terminate the write transaction.
valid_core1 | 1 | 0b | valid write active to core 1. write strobe to core 1. firmware/software sets this bit for write access. software should clear this bit to terminate the write transaction.
valid_common | 2 | 0b | valid write active to common. write strobe to common. firmware/software sets this bit for write access. software should clear this bit to terminate the write transaction.
reserved | 3 | 0b | reserved. reads as 0b.
addr | 12:4 | 0x0 | write address. this field specifies the lower 9 bits of the word address of the eeprom data.
reserved | 15:13 | 000b | reserved. reads as 0b.
data | 31:16 | 0x0 | data written into the eeprom auto read bus.

1. more than one valid bit can be set for write accesses. this results in writing the specific address to more than one destination.
2. not all eeprom addresses are part of the auto read. by using this register software can write to the hardware registers that are configured during auto read only.
3. write access to address 0x12 in the eeprom is protected if a valid eeprom exists. this limitation protects the secured eeprom mechanism.

8.4.7 vpd diagnostic register - vpddiag (0x1060; ro)

this register stores the vpd parameters as parsed by the auto-load process. this register is used for debug only.

field | bit(s) | initial value | description
valid | 0 | x | vpd structure valid.
reserved | 4:1 | x | reserved.
rd tag | 13:5 | x | offset of the read tag in vpd relative to the start of vpd (in bytes).
wr tag | 22:14 | x | offset of the write tag in vpd relative to the start of vpd (in bytes).
end tag | 31:23 | x | offset of the end tag in vpd relative to the start of vpd (in bytes).
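the eerd read sequence described in section 8.4.2 (write the address together with the start bit, poll for done, then take the data field) can be sketched as follows. this is an illustration under stated assumptions: read32 and write32 stand for whatever 32-bit register accessors the platform provides, and FakeDevice is a hypothetical stand-in that exists only so the sketch can run without hardware.

```python
# sketch of the eerd polling sequence from section 8.4.2.
# read32/write32 and FakeDevice are hypothetical; only the offset and
# the bit positions come from the eerd table.

EERD = 0x00014
EERD_START = 1 << 0      # start read (self-clearing)
EERD_DONE = 1 << 1       # read done (ro)
EERD_ADDR_SHIFT = 2      # addr field occupies bits 15:2
EERD_DATA_SHIFT = 16     # data field occupies bits 31:16


def eerd_command(word_addr):
    """Build the value written to eerd: address plus the start bit."""
    if not 0 <= word_addr < (1 << 14):
        raise ValueError("eerd addr field is 14 bits wide")
    return (word_addr << EERD_ADDR_SHIFT) | EERD_START


def eerd_read_word(read32, write32, word_addr, max_polls=1000):
    """Start a read, poll the done bit, and return the 16-bit word."""
    write32(EERD, eerd_command(word_addr))
    for _ in range(max_polls):
        value = read32(EERD)
        if value & EERD_DONE:
            return (value >> EERD_DATA_SHIFT) & 0xFFFF
    raise TimeoutError("eerd done bit was not set")


class FakeDevice:
    """Hypothetical model that completes every read immediately."""

    def __init__(self, words):
        self.words = words
        self.eerd = 0

    def write32(self, offset, value):
        if offset == EERD and value & EERD_START:
            addr = (value >> EERD_ADDR_SHIFT) & 0x3FFF
            word = self.words.get(addr, 0xFFFF)
            self.eerd = (word << EERD_DATA_SHIFT) | (addr << EERD_ADDR_SHIFT) | EERD_DONE

    def read32(self, offset):
        return self.eerd if offset == EERD else 0
```

a bounded poll count is a design choice of the sketch; the text above only says that software can poll the register, so a real driver would pick its own timeout policy.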
8.4.8 mng-eeprom csr i/f

the following registers are reserved for firmware access to the eeprom and are not writable by the host.

8.4.8.1 mng eeprom control register - eemngctl (0x1010; ro)

field | bit(s) | initial value | description
addr | 14:0 | 0x0 | address. this field is written by mng along with start read or start write to indicate the eeprom address to read or write.
start | 15 | 0b | start. writing a 1b to this bit causes the eeprom to start the read or write operation according to the write bit.
write | 16 | 0b | write. this bit tells the eeprom if the current operation is read or write: 0b = read. 1b = write.
eebusy | 17 | 0b | eeprom busy. this bit indicates that the eeprom is busy processing an eeprom transaction and should not be accessed.
cfg_done0 | 18 | 0b | mng configuration cycle done for port 0. this bit indicates that the mng configuration cycle (configuration of serdes, phy, pcie and plls) is done for port 0. this bit is set to 1b by mng firmware to indicate configuration done and cleared by hardware on any of the reset sources that cause the firmware to init the phy. a write of 0b by the firmware does not affect the state of this bit. note: the port 0 driver should not try to access the phy for configuration before this bit is set. note: when the lan function select bit in the eeprom (word 0x21 bit 12 - see section 6.2.22) is set, this bit indicates that the mng configuration cycle is done for port 1.
cfg_done1 | 19 | 0b | mng configuration cycle done for port 1. this bit indicates that the mng configuration cycle (configuration of serdes, phy, pcie and plls) is done for port 1. this bit is set to 1b by mng firmware to indicate configuration done and cleared by hardware on any of the reset sources that cause the firmware to init the phy. a write of 0b by the firmware does not affect the state of this bit. note: the port 1 driver should not try to access the phy for configuration before this bit is set. note: when the lan function select bit in the eeprom (word 0x21 bit 12 - see section 6.2.22) is set, this bit indicates that the mng configuration cycle is done for port 0.
reserved | 30:20 | 0x0 | reserved.
done | 31 | 1b | transaction done. this bit is cleared after the start write or start read bit is set by the mng and is set back again when the eeprom write or read transaction is done.
8.4.8.2 mng eeprom read/write data - eemngdata (0x1014; ro)

field | bit(s) | initial value | description
wrdata | 15:0 | 0x0 | write data. data to be written to the eeprom.
rddata | 31:16 | - | read data. data returned from the eeprom read.

8.5 flow control register descriptions

8.5.1 flow control address low - fcal (0x00028; ro)

flow control packets are defined by 802.3x to be either a unique multicast address or the station address with the ether type field indicating pause. the fca registers provide the value hardware uses to compare incoming packets against, to determine that it should pause its output.

the fcal register contains the lower bits of the internal 48-bit flow control ethernet address. all 32 bits are valid. software can access the high and low registers as a register pair if it can perform a 64-bit access to the pcie bus. the complete flow control multicast address is: 0x01_80_c2_00_00_01; where 0x01 is the first byte on the wire, 0x80 is the second, etc.

note: any packet matching the contents of {fcah, fcal, fct} when ctrl.rfce is set is acted on by the 82576. whether flow control packets are passed to the host (software) depends on the state of the rctl.dpf bit and whether the packet matches any of the normal filters.

field | bit(s) | initial value | description
fcal | 31:0 | 0x00c28001 | flow control address low.

8.5.2 flow control address high - fcah (0x0002c; ro)

this register contains the upper bits of the 48-bit flow control ethernet address. only the lower 16 bits of this register have meaning. the complete flow control address is {fcah, fcal}. the complete flow control multicast address is: 0x01_80_c2_00_00_01; where 0x01 is the first byte on the wire, 0x80 is the second, etc.

field | bit(s) | initial value | description
fcah | 15:0 | 0x0100 | flow control address high. should be programmed with 0x01_00.
reserved | 31:16 | 0b | reserved. reads as 0b.

8.5.3 flow control type - fct (0x00030; r/w)

this register contains the type field that hardware matches to recognize a flow control packet. only the lower 16 bits of this register have meaning. this register should be programmed with 0x88_08. the upper byte, fct[15:8], is first on the wire.

field | bit(s) | initial value | description
fct | 15:0 | 0x8808 | flow control type.
reserved | 31:16 | 0b | reserved. reads as 0b.

8.5.4 flow control transmit timer value - fcttv (0x00170; r/w)

the 16-bit value in the ttv field is inserted into a transmitted frame (either xoff frames or any pause frame value in any software transmitted packets). it counts in units of slot time of 64 bytes. if software needs to send an xon frame, it must set ttv to 0b prior to initiating the pause frame.

field | bit(s) | initial value | description
ttvtc0 | 15:0 | x | transmit timer value. these bits are included in the xoff frame.
reserved | 31:16 | 0b | reserved.

8.5.5 flow control receive threshold low - fcrtl0 (0x02160; r/w)

this register contains the receive threshold used to determine when to send an xon packet. the complete register reflects the threshold in units of bytes. the lower 4 bits must be programmed to 0b (16 byte granularity). software must set xone to enable the transmission of xon frames. each time hardware crosses the receive-high threshold (becoming more full), and then crosses the receive-low threshold while xone is enabled (1b), hardware transmits an xon frame. when xone is set, the rtl field should be programmed to at least 1b (at least 16 bytes).

flow control reception/transmission are negotiated capabilities by the auto-negotiation process. when the 82576 is manually configured, flow control operation is determined by the ctrl.rfce and ctrl.tfce bits.

field | bit(s) | initial value | description
reserved | 3:0 | 0000b | reserved. must be written with 0b.
rtl | 15:4 | 0x0 | receive threshold low. fifo low water mark for flow control transmission. an xon packet is sent if the occupied space in the packet buffer is smaller than or equal to this watermark. this field is in 16 bytes granularity.
reserved | 30:16 | 0x0 | reserved. should be written with 0b for future compatibility. reads as 0b.
xone | 31 | 0b | xon enable. 0b = disabled. 1b = enabled.
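the byte ordering of the flow control address and the fcrtl0 encoding rules above can be checked with a short calculation. the sketch below is an illustration only; the function names are hypothetical, and the rules it encodes (first wire byte in the low byte of fcal, low 4 bits of the threshold zero, rtl at least 16 bytes when xone is set) are the ones stated in sections 8.5.1, 8.5.2 and 8.5.5.

```python
# sketch: derive fcal/fcah from a 6-byte address and build an fcrtl0 value.
# function names are hypothetical; the rules come from sections 8.5.1-8.5.5.


def fc_address_registers(mac):
    """Return (fcal, fcah) for a 6-byte address given in wire order."""
    raw = bytes(mac)
    if len(raw) != 6:
        raise ValueError("expected a 6-byte address")
    fcal = int.from_bytes(raw[0:4], "little")  # first wire byte lands in bits 7:0
    fcah = int.from_bytes(raw[4:6], "little")  # only the lower 16 bits are used
    return fcal, fcah


def fcrtl_value(threshold_bytes, xon_enable):
    """Build an fcrtl0 value from a byte threshold and the xone flag."""
    if threshold_bytes % 16 != 0:
        raise ValueError("lower 4 bits must be 0b (16 byte granularity)")
    if not 0 <= threshold_bytes <= 0xFFF0:
        raise ValueError("threshold does not fit the rtl field (bits 15:4)")
    if xon_enable and threshold_bytes < 16:
        raise ValueError("rtl must be at least 16 bytes when xone is set")
    return threshold_bytes | (int(bool(xon_enable)) << 31)
```

applied to the multicast address 01:80:c2:00:00:01, fc_address_registers yields 0x00c28001 and 0x0100, which matches the initial values listed for fcal and fcah.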
8.5.6 flow control receive threshold high - fcrth0 (0x02168; r/w)

this register contains the receive threshold used to determine when to send an xoff packet. the complete register reflects the threshold in units of bytes. this value must be at maximum 48 bytes less than the maximum number of bytes allocated to the receive packet buffer (rxpbs.rxpbsize1), and the lower 4 bits must be programmed to 0b (16 byte granularity). the value of rth should also be bigger than fcrtl.rtl. each time the receive fifo reaches the fullness indicated by rth, hardware transmits a pause frame if the transmission of flow control frames is enabled.

flow control reception/transmission are negotiated capabilities by the auto-negotiation process. when the 82576 is manually configured, flow control operation is determined by the ctrl.rfce and ctrl.tfce bits.

field | bit(s) | initial value | description
reserved | 3:0 | 000b | reserved. must be written with 0b.
rth | 15:4 | 0x0 | receive threshold high. fifo high water mark for flow control transmission. an xoff packet is sent if the occupied space in the packet buffer is bigger than or equal to this watermark. this field is in 16 bytes granularity.
reserved | 31:16 | 0x0 | reserved. must be set to 0b.

8.5.7 flow control refresh threshold value - fcrtv (0x02460; r/w)

field | bit(s) | initial value | description
fc_refresh_th | 15:0 | 0x0 | flow control refresh threshold. this value indicates the threshold value of the flow control shadow counter; when the counter reaches this value, and the conditions for pause state are still valid (buffer fullness above low threshold value), a pause (xoff) frame is sent to the link partner. if this field contains a zero value, the flow control refresh is disabled. note: this register controls both tcs.
reserved | 31:16 | - | reserved.

8.5.8 flow control status - fcsts0 (0x2464; ro)

this register describes the status of the flow control machine.

field | bit(s) | initial value | description
flow_control state | 0 | 0b | flow control state machine signal. 0b = xon. 1b = xoff.
above high | 1 | | the size of data in the memory is above the high threshold.
below low | 2 | | the size of data in the memory is below the low threshold.
reserved | 15:3 | 0x0 | reserved.
refresh counter | 31:16 | 0x0 | flow control refresh counter.

8.6 pcie register descriptions

8.6.1 pcie control - gcr (0x05b00; rw)

field | bit(s) | initial value | description
ignore rid | 0 | 0x0 | when set, the rid of all dma accesses is the pf rid. this bit should be kept at 0b for normal operation.
reserved | 1 | 0b | reserved.
firmware self_test_enable | 8 | 0b | when set, firmware should perform a self test. reset at power good only.
rx_l0s_adjustment | 9 | 1b | if set, the replay timer always adds the required l0s adjustment. when set to 0b, adds it only when tx l0s are active. reset at power good only.
reserved | 10 | 0 | reserved.
reserved | 11 | 0 | reserved.
completion_timeout_value (ro or rw1) | 15:12 | 0x0 | indicates the selected value for completion timeout. decoding of this field depends on the pcie capability version (see the encoding below). reset at power good only.
completion_timeout_resend | 16 | 1b | when set, enables re-sending of a request once the completion timeout expired. 0b = do not re-send request on completion timeout. 1b = re-send request on completion timeout. this bit is used no matter which timeout mechanism is used. reset at power good only.
completion_timeout_disable (ro or rw1) | 17 | 0b | indicates if pcie completion timeout is supported. 0b = completion timeout enabled. 1b = completion timeout disabled. reset at power good only.
pcie capability version (ro) | 18 | 1b (1) | reports the pcie capability version supported. 0b = capability version = 0x1. 1b = capability version = 0x2.
reserved | 19 | 0 | reserved.
pba_cl_deas | 20 | 0b | if cleared, pba is cleared on de-assertion of msi-x request.
hdr_log inversion | 21 | 0b | if set, the header log in error reporting is written as 31:0 to log1, 63:32 in log2. if not set, the header is written as 127:96 in log1, 95:64 in log2.
reserved | 23:22 | 1b | reserved. must be set to 01b.
l0s_entry_lat | 24 | 0b | l0s entry latency. set to 0b to indicate l0s entry latency is the same as l0s exit latency. set to 1b to indicate l0s entry latency is (l0s exit latency/4).
l1_entry_latency (ro) | 26:25 | 11b | determines the idle time of the pcie link in l0s state before initiating a transition to l1 state. initial value is loaded from the eeprom. 00b = 64 µs. 01b = 256 µs. 10b = 1 ms. 11b = 4 ms.
reserved | 31:27 | 0b | reserved.

completion_timeout_value encoding, capability version = 1 (bits 13:12; bits 15:14 are reserved):

00b | 50 µs to 10 ms (default)
01b | 10 ms to 200 ms
10b | 200 ms to 4 s
11b | 4 s to 64 s

completion_timeout_value encoding, capability version = 2:

0000b | 50 µs to 50 ms
0001b | 50 µs to 100 µs
0010b | 1 ms to 10 ms
0011b | reserved
0100b | reserved
0101b | 16 ms to 55 ms
0110b | 65 ms to 210 ms
0111b | reserved
1000b | reserved
1001b | 260 ms to 900 ms
1010b | 1 s to 3.5 s
1011b | reserved
1100b | reserved
1101b | 4 s to 13 s
1110b | 17 s to 64 s
1111b | reserved

1. the default value for this field is read from eeprom word 0x18 bits 11:10. if these bits are set to 10b, then this field is set to 1b; otherwise it is set to 0b.

8.6.2 iov control - iovctl (0x05bbc; rw)

field | bit(s) | initial value | description
use vf queues (wo) | 0 | 0b | should be set at least 1 ms after iov was disabled and before the pf reuses queues previously assigned to vfs. if the pf does not re-use the queues, there is no need to set this bit.
reserved | 30:1 | 0x0 | reserved.
use vf queues enable | 31 | 1b | if set, then the use vf queues bit should be set before re-using the queues. if not set, vf enable in the config space should be cleared only after all vfs have been quiesced.

8.6.3 function tag - functag (0x05b08; r/w)

field | bit(s) | initial value | description
cnt_0_tag | 4:0 | 0x0 | tag number for event 6/1d, if located in counter 0.
cnt_0_func | 7:5 | 0x0 | function number for event 6/1d, if located in counter 0.
cnt_1_tag | 12:8 | 0x0 | tag number for event 6/1d, if located in counter 1.
cnt_1_func | 15:13 | 0x0 | function number for event 6/1d, if located in counter 1.
cnt_2_tag | 20:16 | 0x0 | tag number for event 6/1d, if located in counter 2.
cnt_2_func | 23:21 | 0x0 | function number for event 6/1d, if located in counter 2.
cnt_3_tag | 28:24 | 0x0 | tag number for event 6/1d, if located in counter 3.
cnt_3_func | 31:29 | 0x0 | function number for event 6/1d, if located in counter 3.
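because the meaning of the gcr completion_timeout_value field depends on the pcie capability version bit, a decoder has to read bit 18 first. the sketch below shows one way to do that; the range strings are copied from the gcr description in section 8.6.1, while the function name and the returned dictionary are assumptions made for illustration.

```python
# sketch: decode the completion timeout fields of a raw gcr value.
# ranges are copied from the gcr description; names are hypothetical.

TIMEOUT_CAP_V1 = {
    0b00: "50 us to 10 ms",
    0b01: "10 ms to 200 ms",
    0b10: "200 ms to 4 s",
    0b11: "4 s to 64 s",
}

TIMEOUT_CAP_V2 = {
    0b0000: "50 us to 50 ms",
    0b0001: "50 us to 100 us",
    0b0010: "1 ms to 10 ms",
    0b0101: "16 ms to 55 ms",
    0b0110: "65 ms to 210 ms",
    0b1001: "260 ms to 900 ms",
    0b1010: "1 s to 3.5 s",
    0b1101: "4 s to 13 s",
    0b1110: "17 s to 64 s",
}


def decode_completion_timeout(gcr):
    """Return the timeout settings encoded in gcr bits 18:12."""
    cap_v2 = bool((gcr >> 18) & 1)          # 1b = capability version 0x2
    if cap_v2:
        code = (gcr >> 12) & 0xF            # all of bits 15:12
        timeout_range = TIMEOUT_CAP_V2.get(code, "reserved")
    else:
        code = (gcr >> 12) & 0x3            # bits 13:12 only
        timeout_range = TIMEOUT_CAP_V1[code]
    return {
        "capability_version": 2 if cap_v2 else 1,
        "range": timeout_range,
        "resend": bool((gcr >> 16) & 1),
        "disabled": bool((gcr >> 17) & 1),
    }
```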
8.6.4 function active and power state to mng - factps (0x05b30; ro)

firmware uses this register for configuration.

field | bit(s) | initial value | description
func0 power state | 1:0 | 00b | power state indication of function 0. 00b = dr. 01b = d0u. 10b = d0a. 11b = d3.
lan0 valid | 2 | 0b | lan 0 enable. when set to 0b, it indicates that the lan 0 function is disabled. when the function is enabled, the bit is set to 1b. the lan 0 enable is set by the lan 0 enable / test_point[2] strapping pin.
func0 aux_en | 3 | 0b | function 0 auxiliary (aux) power pm enable bit shadow from the configuration space.
reserved | 5:4 | 0b | reserved.
func1 power state | 7:6 | 00b | power state indication of function 1. 00b = dr. 01b = d0u. 10b = d0a. 11b = d3.
lan1 valid | 8 | 0b | lan 1 enable. when set to 0b, it indicates that the lan 1 function is disabled. when the function is enabled, the bit is set to 1b. the lan 1 enable is set by the lan 1 enable / test_point[3] strapping pin.
func1 aux_en | 9 | 0b | function 1 auxiliary (aux) power pm enable bit shadow from the configuration space.
reserved | 28:10 | 0x0 | reserved.
mngcg | 29 | 0b | mng clock gated. when set, indicates that the manageability clock is gated.
lan function sel | 30 | 0b | when both lan ports are enabled and lan function sel equals 0b, lan 0 is routed to pcie function 0 and lan 1 is routed to pcie function 1. if lan function sel equals 1b, lan 0 is routed to pcie function 1 and lan 1 is routed to pcie function 0. if either lan function is disabled, the other one is routed to pcie function 0 regardless of lan function sel. this bit is initialized from eeprom word 0x21.
pm state changed | 31 | 0b | indication that one or more of the functions' power states has changed. this bit is also a signal to the mng unit to create an interrupt. this bit is cleared on read, and is not set for at least 8 cycles after it was cleared.
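the lan-to-function routing rule in the lan function sel description has three cases (both ports enabled, one port enabled, none enabled), which is easy to get wrong when reading factps by hand. the following sketch decodes the power states and applies that rule; the function name and the shape of the result are hypothetical, and only the bit positions and the routing rule come from the factps table.

```python
# sketch: decode factps power states and the lan-to-pcie-function routing.
# names are hypothetical; bit positions follow the factps table.

POWER_STATES = {0b00: "dr", 0b01: "d0u", 0b10: "d0a", 0b11: "d3"}


def decode_factps(value):
    """Return power states, enable flags, and routing for a factps value."""
    lan0_valid = bool((value >> 2) & 1)
    lan1_valid = bool((value >> 8) & 1)
    lan_function_sel = (value >> 30) & 1

    if lan0_valid and lan1_valid:
        # sel = 0b: lan0 -> function 0, lan1 -> function 1; sel = 1b: swapped
        routing = {"lan0": lan_function_sel, "lan1": 1 - lan_function_sel}
    elif lan0_valid:
        routing = {"lan0": 0}   # a single enabled port always uses function 0
    elif lan1_valid:
        routing = {"lan1": 0}
    else:
        routing = {}

    return {
        "func0_power_state": POWER_STATES[value & 0x3],
        "func1_power_state": POWER_STATES[(value >> 6) & 0x3],
        "lan0_valid": lan0_valid,
        "lan1_valid": lan1_valid,
        "mng_clock_gated": bool((value >> 29) & 1),
        "routing": routing,
    }
```

note that pm state changed (bit 31) is cleared on read, so a real driver would have to capture it from the same read it decodes; the sketch leaves that bit out.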
8.6.5 serdes/ccm/pcie csr - gioanactl0 (0x05b34; r/w)

firmware uses this register for analog circuit configuration.

field | bit(s) | initial value | description
data | 7:0 | 0b | data to serdes.
address | 15:8 | 0b | address to serdes.
reserved | 30:16 | 0b | reserved.
done indication | 31 | 1b | when a write operation is completed this bit is set to 1b, indicating that new data can be written. this bit is overwritten to 0b by new data.

8.6.6 serdes/ccm/pcie csr - gioanactl1 (0x05b38; r/w)

firmware uses this register for analog circuit configuration.

field | bit(s) | initial value | description
data | 7:0 | 0b | data to serdes.
address | 15:8 | 0b | address to serdes.
reserved | 30:16 | 0b | reserved.
done indication | 31 | 1b | when a write operation is completed this bit is set to 1b, indicating that new data can be written. this bit is overwritten to 0b by new data.

8.6.7 serdes/ccm/pcie csr - gioanactl2 (0x05b3c; r/w)

firmware uses this register for analog circuit configuration.

field | bit(s) | initial value | description
data | 7:0 | 0b | data to serdes.
address | 15:8 | 0b | address to serdes.
reserved | 30:16 | 0b | reserved.
done indication | 31 | 1b | when a write operation is completed this bit is set to 1b, indicating that new data can be written. this bit is overwritten to 0b by new data.

8.6.8 serdes/ccm/pcie csr - gioanactl3 (0x05b40; r/w)

firmware uses this register for analog circuit configuration.

field | bit(s) | initial value | description
data | 7:0 | 0b | data to serdes.
address | 15:8 | 0b | address to serdes.
reserved | 30:16 | 0b | reserved.
done indication | 31 | 1b | when a write operation is completed this bit is set to 1b, indicating that new data can be written. this bit is overwritten to 0b by new data.

8.6.9 serdes/ccm/pcie csr - gioanactlall (0x05b44; r/w)

firmware uses this register for analog circuit configuration.

field | bit(s) | initial value | description
data | 7:0 | 0b | data to serdes.
address | 15:8 | 0b | address to serdes.
reserved | 30:16 | 0b | reserved.
done indication | 31 | 1b | when a write operation is completed this bit is set to 1b, indicating that new data can be written. this bit is overwritten to 0b by new data.

8.6.10 serdes/ccm/pcie csr - ccmctl (0x05b48; r/w)

firmware uses this register for analog circuit configuration.

field | bit(s) | initial value | description
data | 7:0 | 0b | data to serdes.
address | 15:8 | 0b | address to serdes.
reserved | 30:16 | 0b | reserved.
done indication | 31 | 1b | when a write operation is completed this bit is set to 1b, indicating that new data can be written. this bit is overwritten to 0b by new data.

8.6.11 serdes/ccm/pcie csr - scctl (0x05b4c; r/w)

firmware uses this register for analog circuit configuration.

field | bit(s) | initial value | description
data | 7:0 | 0b | data to serdes.
address | 15:8 | 0b | address to serdes.
reserved | 30:16 | 0b | reserved.
done indication | 31 | 1b | when a write operation is completed this bit is set to 1b, indicating that new data can be written. this bit is overwritten to 0b by new data.
8.6.12 mirrored revision id - mrevid (0x05b64; r/w)

field | bit(s) | initial value | description
eeprom revid | 7:0 | 0x0 | mirroring of revision id loaded from the eeprom in pci configuration space (from word 0x1e).
default revid | 15:8 | 0x0 | mirroring of default rev id before an eeprom load. set to 0b.
reserved | 31:16 | 0x0 | reserved.

8.7 semaphore registers

this section contains registers common to both cores used to coordinate between the two functions. the usage of these registers is described in section 4.5.12.

8.7.1 software semaphore - swsm (0x05b50; r/w)

field | bit(s) | initial value | description
smbi (rs) | 0 | 0x0 | software/software semaphore bit. this bit is set by hardware when this register is read by the device driver and cleared when the host driver writes a 0b to it. the first time this register is read, the value is 0b. in the next read the value is 1b (hardware mechanism). the value remains 1b until the software device driver clears it. this bit can be used as a semaphore between the two device drivers in the 82576. this bit is cleared on gio soft reset.
swesmbi | 1 | 0x0 | software/firmware semaphore bit. this bit should be set only by the device driver (read only to firmware). the bit is not set if bit 0 in the fwsm register is set. the device driver should set this bit and then read it to see if it was set. if it was set, the device driver can access the sw_fw_sync register. the device driver should clear this bit after modifying the sw_fw_sync register. hardware clears this bit on gio soft reset.
wmng (sc) | 2 | 0x0 | wake mng clock. when this bit is set, hardware wakes the mng clock (if gated). asserting this bit does not clear the cfg_done bit in the eemngctl register. this bit is self cleared on writes.
eeur | 3 | 0x0 | eeprom update request. eeprom request update from firmware. software should clear this bit after the firmware valid bit (fwsm.fw_val_bit) is set.
reserved | 31:4 | 0x0 | reserved.
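The SMBI and SWESMBI behaviour described in the table above can be sketched as a two-step acquire. This is a minimal illustration, not driver code: `struct swsm_model`, `swsm_read`, `swsm_write`, `swsm_acquire` and `swsm_release` are hypothetical names, and the register is replaced by a small software model (an assumption) so the read-sets-SMBI side effect can be exercised without hardware. The `fw_holds` flag stands in for bit 0 of FWSM.

```c
#include <stdbool.h>
#include <stdint.h>

#define SWSM_SMBI    (1u << 0)  /* software/software semaphore bit */
#define SWSM_SWESMBI (1u << 1)  /* software/firmware semaphore bit */

/* Hypothetical software model of SWSM, for illustration only. */
struct swsm_model {
    uint32_t value;    /* current register contents */
    bool     fw_holds; /* models FWSM bit 0 being set by firmware */
};

/* Read: return the current value, then set SMBI as hardware would. */
static uint32_t swsm_read(struct swsm_model *m)
{
    uint32_t v = m->value;
    m->value |= SWSM_SMBI;
    return v;
}

/* Write: SMBI is only cleared by writing 0b to it; SWESMBI is not
 * set while firmware holds its semaphore bit. */
static void swsm_write(struct swsm_model *m, uint32_t v)
{
    uint32_t next = m->value;

    if (!(v & SWSM_SMBI))
        next &= ~SWSM_SMBI;
    if (v & SWSM_SWESMBI) {
        if (!m->fw_holds)
            next |= SWSM_SWESMBI;
    } else {
        next &= ~SWSM_SWESMBI;
    }
    m->value = next;
}

/* Clear both semaphore bits. */
static void swsm_release(struct swsm_model *m)
{
    swsm_write(m, swsm_read(m) & ~(SWSM_SMBI | SWSM_SWESMBI));
}

/* Step 1: a read that returns SMBI = 0b means this caller now owns it.
 * Step 2: set SWESMBI, then read it back to see whether it was set. */
static bool swsm_acquire(struct swsm_model *m, int retries)
{
    int i;

    for (i = 0; i < retries; i++)
        if (!(swsm_read(m) & SWSM_SMBI))
            break;
    if (i == retries)
        return false;

    for (i = 0; i < retries; i++) {
        swsm_write(m, swsm_read(m) | SWSM_SWESMBI);
        if (swsm_read(m) & SWSM_SWESMBI)
            return true;
    }
    swsm_release(m); /* give SMBI back on failure */
    return false;
}
```

A real driver would add a delay between retries and would call the release step after modifying SW_FW_SYNC; neither the delay nor the retry count is specified in this section.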
8.7.2 firmware semaphore - fwsm (0x05b54; r/ws)

field | bit(s) | initial value | description
eep_fw_semaphore | 0 | 0x0 | software/firmware semaphore. firmware should set this bit to 1b before accessing the sw_fw_sync register. if software, using the swsm, has not locked the sw_fw_sync register, firmware is able to set this bit to 1b. firmware should set this bit to 0b after modifying the sw_fw_sync register.
fw_mode | 3:1 | 0x0 | firmware mode. indicates the firmware mode as follows: 000b = no mng; 001b = reserved; 010b = pt mode; 011b = reserved; 100b = host interface enable only.
reserved | 5:4 | 00b | reserved.
eep_reload_ind | 6 | 0x0 | eeprom reloaded indication. set to 1b after firmware reloads the eeprom. cleared by firmware once the "clear bit" host command is received from host software.
reserved | 14:7 | 0x0 | reserved.
fw_val_bit | 15 | 0x0 | firmware valid bit. hardware clears this bit in reset de-assertion so software can know firmware mode (bits 1-5) is invalid. firmware should set this bit to 1b when it is ready (end of boot sequence).
reset_cnt | 18:16 | 0x0 | reset counter. firmware increments the count at every reset.
ext_err_ind | 24:19 | 0x0 | external error indication. firmware writes here the reason that the firmware has reset / clock gated. for example, eeprom, flash, patch corruption, etc. possible values: 0x00 = no error; 0x01 = invalid eeprom checksum; 0x02 = unlocked secured eeprom; 0x03 = clock off host command; 0x04 = invalid flash checksum; 0x05 = c0 checksum failed; 0x06 = c1 checksum failed; 0x07 = c2 checksum failed; 0x08 = c3 checksum failed; 0x09 = tlb table exceeded; 0x0a = dma load failed; 0x0b = bad hardware version in patch load; 0x0c = flash device not supported; 0x0d = unspecified error; 0x3f = reserved - max error value.
pcie_config_err_ind | 25 | 0x0 | pcie configuration error indication. set to 1b by firmware when it fails to configure pcie interface. cleared by firmware upon successful configuration of pcie interface.
phy_serdes0_config_err_ind | 26 | 0x0 | phy/serdes0 configuration error indication. set to 1b by firmware when it fails to configure lan0 phy/serdes. cleared by firmware upon successful configuration of lan0 phy/serdes.
phy_serdes1_config_err_ind | 27 | 0x0 | phy/serdes1 configuration error indication. set to 1b by firmware when it fails to configure lan1 phy/serdes. cleared by firmware upon successful configuration of lan1 phy/serdes.
reserved | 31:28 | 0x0 | reserved.

notes:
1. this register should be written only by the manageability firmware. the device driver should only read this register.
2. firmware ignores the eeprom semaphore in operating system hung states.
3. bits 15:0 are cleared on firmware reset.

8.7.3 software-firmware synchronization - sw_fw_sync (0x05b5c; rws)

this register is intended to synchronize between software and firmware. this register is common to both ports 0 and 1.
field | bit(s) | initial value | description
sw_eep_sm | 0 | 0b | when set to 1b, eeprom access is owned by software.
sw_phy_sm0 | 1 | 0b | when set to 1b, phy 0 access is owned by software.
sw_phy_sm1 | 2 | 0b | when set to 1b, phy 1 access is owned by software.
sw_mac_csr_sm | 3 | 0b | when set to 1b, software owns access to shared csrs.
sw_flash_sm | 4 | 0b | when set to 1b, software owns access to the flash.
reserved | 15:5 | 0x0 | reserved for future use.
fw_eep_sm | 16 | 0b | when set to 1b, eeprom access is owned by firmware.
fw_phy_sm0 | 17 | 0b | when set to 1b, phy 0 access is owned by firmware.
fw_phy_sm1 | 18 | 0b | when set to 1b, phy 1 access is owned by firmware.
fw_mac_csr_sm | 19 | 0b | when set to 1b, firmware owns access to shared csrs.
fw_flash_sm | 20 | 0b | when set to 1b, firmware owns access to the flash.
reserved | 31:21 | 0x0 | reserved for future use.

reset conditions:
- the software-controlled bits 15:0 are reset as any other csr on global resets, d3hot exit, software reset, and forced tco. software is expected to clear the bits on entry to d3 state.
- the firmware-controlled bits 31:16 are reset on internal_power_on_reset and firmware reset.

8.8 interrupt register descriptions

8.8.1 extended interrupt cause - eicr (0x01580; rc/w1c)

this register contains the frequent interrupt conditions for the 82576. each time an interrupt causing event occurs, the corresponding interrupt bit is set in this register. an interrupt is generated each time one of the bits in this register is set and the corresponding interrupt is enabled via the interrupt mask set/read register. the interrupt might be delayed by the selected interrupt throttling register.

note that the software device driver cannot determine from the rxtxq bits which of the following events caused the interrupt: receive descriptor write back, receive descriptor minimum threshold hit, low latency interrupt for rx, or transmit descriptor write back.

writing a 1b to any bit in the register clears that bit. writing a 0b to any bit has no effect on that bit. register bits are cleared on register read.
auto clear can be enabled for any or all of the bits in this register.

table 8-9. eicr register bit description - non msi-x mode (gpie.multiple_msix = 0)

field | bit(s) | initial value | description
rxtxq | 15:0 | 0x0 | receive/transmit queue interrupts. one bit per queue or a bundle of queues, activated on receive/transmit queue events for the corresponding bit, such as: receive descriptor write back; receive descriptor minimum threshold hit; transmit descriptor write back. the mapping of actual queue to the appropriate rxtxq bit is according to the ivar registers.
reserved | 29:16 | 0x0 | reserved.
tcp timer | 30 | 0b | tcp timer expired. activated when the tcp timer reaches its terminal count.
other cause | 31 | 0b | interrupt cause active. activated when any bit in the icr register is set.

table 8-10. eicr register bit description - msi-x mode (gpie.multiple_msix = 1)

field | bit(s) | initial value | description
msix | 24:0 | 0x0 | indicates an interrupt cause mapped to msi-x vectors 24:0.
reserved | 31:25 | 0x0 | reserved.

note: in iov mode, only bit zero of this vector is available for the pf function.

8.8.2 extended interrupt cause set - eics (0x01520; wo)

software uses this register to set an interrupt condition. any bit written with a 1b sets the corresponding bit in the extended interrupt cause read register. an interrupt is then generated if one of the bits in this register is set and the corresponding interrupt is enabled via the extended interrupt mask set/read register. bits written with 0b are unchanged.

note: in order to set bit 31 of the eicr (other causes), the ics and ims registers should be used in order to enable one of the legacy causes.
table 8-11. eics register bit description - non msi-x mode (gpie.multiple_msix = 0)

field | bit(s) | initial value | description
rxtxq | 15:0 | 0x0 | sets the corresponding eicr rxtxq interrupt condition.
reserved | 29:16 | 0x0 | reserved.
tcp timer | 30 | 0b | sets the corresponding eicr tcp interrupt condition.
reserved | 31 | 0b | reserved.

table 8-12. eics register bit description - msi-x mode (gpie.multiple_msix = 1)

field | bit(s) | initial value | description
msix | 24:0 | 0x0 | sets the corresponding eicr bit of msi-x vectors 24:0.
reserved | 31:25 | 0x0 | reserved.

8.8.3 extended interrupt mask set/read - eims (0x01524; rws)

reading this register returns which bits have an interrupt mask set. an interrupt in eicr is enabled if its corresponding mask bit is set to 1b and disabled if its corresponding mask bit is set to 0b. a pci interrupt is generated each time one of the bits in this register is set and the corresponding interrupt condition occurs (subject to throttling). the occurrence of an interrupt condition is reflected by having a bit set in the extended interrupt cause read register.

an interrupt can be enabled by writing a 1b to the corresponding mask bit location (as defined in the eicr register) in this register. any bits written with a 0b are unchanged. as a result, if software needs to disable an interrupt condition that had been previously enabled, it must write to the extended interrupt mask clear register rather than writing a 0b to a bit in this register.

table 8-13. eims register bit description - non msi-x mode (gpie.multiple_msix = 0)

field | bit(s) | initial value | description
rxtxq | 15:0 | 0x0 | set mask bit for the corresponding eicr rxtxq interrupt.
reserved | 29:16 | 0x0 | reserved.
tcp timer | 30 | 0b | set mask bit for the corresponding eicr tcp timer interrupt condition.
other cause | 31 | 1b | set mask bit for the corresponding eicr other cause interrupt condition.
table 8-14. eims register bit description - msi-x mode (gpie.multiple_msix = 1)

field | bit(s) | initial value | description
msix | 24:0 | 0x0 | set mask bit for the corresponding eicr bit of msi-x vectors 24:0.
reserved | 31:25 | 0x0 | reserved.

8.8.4 extended interrupt mask clear - eimc (0x01528; wo)

this register provides software a way to disable certain or all interrupts. software disables a given interrupt by writing a 1b to the corresponding bit in this register.

on interrupt handling, the software device driver should set all the bits in this register related to the current interrupt request, even if the interrupt was triggered by only some of the causes that were allocated to this vector.

interrupts are presented to the bus interface only when the mask bit is set to 1b and the cause bit is set to 1b. the status of the mask bit is reflected in the extended interrupt mask set/read register and the status of the cause bit is reflected in the interrupt cause read register.

software blocks interrupts by clearing the corresponding mask bit. this is accomplished by writing a 1b to the corresponding bit location (as defined in the eicr register) of that interrupt in this register. bits written with 0b are unchanged (their mask status does not change).

table 8-15. eimc register bit description - non msi-x mode (gpie.multiple_msix = 0)

field | bit(s) | initial value | description
rxtxq | 15:0 | 0x0 | clear mask bit for the corresponding eicr rxtxq interrupt.
reserved | 29:16 | 0x0 | reserved.
tcp timer | 30 | 0b | clear mask bit for the corresponding eicr tcp timer interrupt.
other cause | 31 | 1b | clear mask bit for the corresponding eicr other cause interrupt.

table 8-16. eimc register bit description - msi-x mode (gpie.multiple_msix = 1)

field | bit(s) | initial value | description
msix | 24:0 | 0x0 | clear mask bit for the corresponding eicr bit of msi-x vectors 24:0.
reserved | 31:25 | 0x0 | reserved.

8.8.5 extended interrupt auto clear - eiac (0x0152c; r/w)

this register is mapped like the eics, eims, and eimc registers, with each bit mapped to the corresponding msi-x vector.
this register is relevant to msi-x mode only, where read-to-clear cannot be used, as it might erase causes tied to other vectors. if any bits are set in eiac, the eicr register should not be read. bits without auto clear set need to be cleared with write-to-clear.

eicr bits that have auto clear set are cleared by the internal emission of the corresponding msi-x message even if this vector is disabled by the operating system. the msi-x message can be delayed by eitr moderation from the time the eicr bit is activated.

when using iov, the bits that correspond to msi-x vectors that are assigned to a vf are read-only. use vteiac to write these bits. see section 8.26.2.2, msi-x registers, for the mapping of msi-x vectors to vfs.

field | bit(s) | initial value | description
msix | 24:0 | 0x0 | auto clear bit for the corresponding eicr bit of msi-x vectors 24:0.
reserved | 31:25 | 0x0 | reserved.

8.8.6 extended interrupt auto mask enable - eiam (0x01530; r/w)

each bit in this register enables clearing of the corresponding bit in eims following read- or write-to-clear to eicr or setting of the corresponding bit in eims following a write-to-set to eics.

in msi-x mode, this register controls which of the bits in eimc to clear upon interrupt generation.

when using iov, the bits that correspond to msi-x vectors that are assigned to a vf are read-only. use vteiac to write these bits. see section 8.26.2.2, msi-x registers, for the mapping of msi-x vectors to vfs.

table 8-17. eiam register bit description - non msi-x mode (gpie.multiple_msix = 0)

field | bit(s) | initial value | description
rxtxq | 15:0 | 0x0 | auto mask bit for the corresponding eicr rxtxq interrupt.
reserved | 29:16 | 0x0 | reserved.
tcp timer | 30 | 0b | auto mask bit for the corresponding eicr tcp timer interrupt condition.
other cause | 31 | 0b | auto mask bit for the corresponding eicr other cause interrupt condition.

table 8-18. eiam register bit description - msi-x mode (gpie.multiple_msix = 1)

field | bit(s) | initial value | description
msix | 24:0 | 0x0 | auto mask bit for the corresponding eicr bit of msi-x vectors 24:0.
reserved | 31:25 | 0x0 | reserved.
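The write-1-to-set and write-1-to-clear conventions of EICS, EICR, EIMS and EIMC (sections 8.8.1 to 8.8.4) can be illustrated with a small software model. This is a sketch under stated assumptions: `struct intr_model` and the helper names are hypothetical, and the model deliberately ignores throttling, read-to-clear, auto clear (EIAC) and auto mask (EIAM).

```c
#include <stdint.h>

/* Hypothetical model of the cause and mask state, for illustration only. */
struct intr_model {
    uint32_t eicr; /* extended interrupt cause bits */
    uint32_t eims; /* extended interrupt mask bits  */
};

/* EICS: bits written with 1b set the cause; 0b bits are unchanged. */
static void eics_write(struct intr_model *m, uint32_t v) { m->eicr |= v; }

/* EICR write: bits written with 1b are cleared; 0b bits are unchanged. */
static void eicr_write(struct intr_model *m, uint32_t v) { m->eicr &= ~v; }

/* EIMS: bits written with 1b enable the interrupt; 0b bits are unchanged. */
static void eims_write(struct intr_model *m, uint32_t v) { m->eims |= v; }

/* EIMC: bits written with 1b disable the interrupt; 0b bits are unchanged. */
static void eimc_write(struct intr_model *m, uint32_t v) { m->eims &= ~v; }

/* An interrupt is presented only when both cause and mask bits are 1b. */
static uint32_t intr_pending(const struct intr_model *m)
{
    return m->eicr & m->eims;
}
```

The point of the split set/clear registers is that neither operation needs a read-modify-write: writing 0b to EIMS never disables anything, so disabling always goes through EIMC.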
8.8.7 interrupt cause read register - icr (0x01500; rc/w1c)

this register contains the interrupt conditions for the 82576 that are not present directly in the eicr. each time an icr interrupt causing event occurs, the corresponding interrupt bit is set in this register. the eicr.other bit reflects the setting of interrupt causes from icr as masked by the interrupt mask set/read register. each time all un-masked causes in icr are cleared, the eicr.other bit is also cleared.

icr bits are cleared on register read. clear-on-read can be enabled/disabled through a general configuration register bit. auto clear is not available for the bits in this register.

in order to prevent unwanted lsc interrupts during initialization, software should disable this interrupt until the end of initialization.

field | bit(s) | initial value | description
txdw | 0 | 0b | transmit descriptor written back. set when the 82576 writes back a tx descriptor to memory.
reserved | 1 | 0b | reserved. should be set to 0b for compatibility.
lsc | 2 | 0b | link status change. this bit is set each time the link status changes (either from up to down, or from down to up). this bit is affected by the link indication from the phy (internal phy mode).
reserved | 3 | 0b | reserved.
rxdmt0 | 4 | 0b | receive descriptor minimum threshold reached. indicates that the minimum number of receive descriptors are available and software should load more receive descriptors.
macsec | 5 | 0b | indicates that the tx macsec packet counter reached the threshold requiring key exchange.
rxo | 6 | 0b | receiver overrun. set on receive data fifo overrun. could be caused by no available receive buffers or by inadequate pcie receive bandwidth.
rxdw | 7 | 0b | receiver descriptor write back. set when the 82576 writes back an rx descriptor to memory.
vmmb | 8 | 0b | set in iov mode when a vf sends a message or an acknowledge of a message to the pf. also set when an flr is asserted for one of the vfs.
reserved | 9 | 0b | reserved.
reserved | 10 | 0b | reserved.
gpi_sdp0 | 11 | 0b | general purpose interrupt on sdp0. if gpi interrupt detection is enabled on this pin (via ctrl.sdp0_gpien), this interrupt cause is set when the sdp0 is sampled high.
gpi_sdp1 | 12 | 0b | general purpose interrupt on sdp1. if gpi interrupt detection is enabled on this pin (via ctrl.sdp1_gpien), this interrupt cause is set when the sdp1 is sampled high.
gpi_sdp2 | 13 | 0b | general purpose interrupt on sdp2. if gpi interrupt detection is enabled on this pin (via ctrl_ext.sdp2_gpien), this interrupt cause is set when the sdp2 is sampled high.
gpi_sdp3 | 14 | 0b | general purpose interrupt on sdp3. if gpi interrupt detection is enabled on this pin (via ctrl_ext.sdp3_gpien), this interrupt cause is set when the sdp3 is sampled high.
ptrap | 15 | 0b | probe trap interrupt. when set, the probe mode trap test mode trapped the requested event.
reserved | 17:16 | 000b | reserved.
mng | 18 | 0b | manageability event detected. indicates that a manageability event happened. when the 82576 is at power down mode, the ipmi can generate a pme for the same events that would cause an interrupt when the 82576 is at the d0 state.
reserved | 19 | 0b | reserved.
omed | 20 | 0b | other media energy detect. when in serdes/sgmii mode, indicates that link status has changed on the 1000base-t phy or when in 1000base-t phy mode, there is a change in serdes/sgmii link status.
reserved | 21 | 0b | reserved.
fer | 22 | 0b | fatal error. this bit is set when a fatal error is detected in one of the memories.
nfer | 23 | 0b | non fatal error. this bit is set when a non fatal error is detected in one of the memories.
csrto | 24 | 0b | this bit is set upon a csr access time out indication.
sce | 25 | 0b | storm control event. this bit is set when multicast or broadcast storm control mechanism is activated or de-activated.
software wd | 26 | 0b | software watchdog. this bit is set after a software watchdog timer times out.
reserved | 27 | 0b | reserved.
mddet | 28 | 0b | detected malicious driver behavior. set when one of the queues used malformed descriptors or when one of the anti spoof checks triggered. in virtualized systems, might indicate a malicious or buggy driver. note: this bit should never rise during normal operation.
reserved | 29 | 0b | reserved.
tcp timer | 30 | 0b | tcp timer interrupt.
inta | 31 | 0b | interrupt asserted. indicates that the int line is asserted. can be used by driver in shared interrupt scenario to decide if the received interrupt was emitted by the 82576. this bit is not valid in msi/msi-x environments.

8.8.8 interrupt cause set register - ics (0x01504; wo)

software uses this register to set an interrupt condition. any bit written with a 1b sets the corresponding interrupt. this results in the corresponding bit being set in the interrupt cause read register (see section 8.8.7). a pcie interrupt is generated if one of the bits in this register is set and the corresponding interrupt is enabled through the interrupt mask set/read register (see section 8.8.9). bits written with 0b are unchanged.

field | bit(s) | initial value | description
txdw | 0 | 0b | sets the transmit descriptor written back interrupt.
reserved | 1 | - | reserved.
lsc | 2 | 0b | sets the link status change interrupt.
reserved | 3 | 0b | reserved.
rxdmt0 | 4 | 0b | sets the receive descriptor minimum threshold hit interrupt.
macsec | 5 | 0b | sets the macsec interrupt.
rxo | 6 | 0b | sets the receiver overrun interrupt.
rxdw | 7 | 0b | sets the receiver descriptor write back interrupt.
vmmb | 8 | 0b | sets the vm mailbox interrupt.
reserved | 9 | 0b | reserved.
reserved | 10 | 0b | reserved.
gpi_sdp0 | 11 | 0b | sets the general purpose interrupt, related to sdp0 pin.
gpi_sdp1 | 12 | 0b | sets the general purpose interrupt, related to sdp1 pin.
gpi_sdp2 | 13 | 0b | sets the general purpose interrupt, related to sdp2 pin.
gpi_sdp3 | 14 | 0b | sets the general purpose interrupt, related to sdp3 pin.
ptrap | 15 | 0b | sets the probe trap interrupt.
reserved | 17:16 | 0b | reserved.
mng | 18 | 0b | sets the management event interrupt.
reserved | 19 | 0b | reserved.
omed | 20 | 0b | sets the other media energy detected interrupt.
reserved | 21 | 0b | reserved.
fer | 22 | 0b | sets the fatal error interrupt.
nfer | 23 | 0b | sets the non fatal error interrupt.
csrto | 24 | 0b | sets the csr access time out indication interrupt.
sce | 25 | 0b | sets the storm control event interrupt.
software wd | 26 | 0b | sets the software watchdog interrupt.
reserved | 27 | 0b | reserved.
doutsync | 28 | 0b | sets the dma tx out of sync interrupt.
reserved | 29 | 0b | reserved.
tcp timer | 30 | 0b | sets the tcp timer interrupt.
reserved | 31 | 0b | reserved.

8.8.9 interrupt mask set/read register - ims (0x01508; r/w)

reading this register returns bits that have an interrupt mask set. an interrupt is enabled if its corresponding mask bit is set to 1b and disabled if its corresponding mask bit is set to 0b. a pcie interrupt is generated each time one of the bits in this register is set and the corresponding interrupt condition occurs. the occurrence of an interrupt condition is reflected by having a bit set in the interrupt cause read register (see section 8.8.7).

a particular interrupt can be enabled by writing a 1b to the corresponding mask bit in this register. any bits written with a 0b are unchanged. as a result, if software desires to disable a particular interrupt condition that had been previously enabled, it must write to the interrupt mask clear register (see section 8.8.10) rather than writing a 0b to a bit in this register.

field | bit(s) | initial value | description
txdw | 0 | 0b | sets/reads the mask for transmit descriptor written back interrupt.
reserved | 1 | - | reserved.
lsc | 2 | 0b | sets/reads the mask for link status change interrupt.
reserved | 3 | 0b | reserved.
rxdmt0 | 4 | 0b | sets/reads the mask for receive descriptor minimum threshold hit interrupt.
macsec | 5 | 0b | sets/reads the mask for macsec interrupt.
rxo | 6 | 0b | sets/reads the mask for receiver overrun interrupt.
rxdw | 7 | 0b | sets/reads the mask for receiver descriptor write back interrupt.
vmmb | 8 | 0b | sets/reads the mask for mailbox interrupt.
reserved | 9 | 0b | reserved.
reserved | 10 | 0b | reserved.
gpi_sdp0 | 11 | 0b | sets/reads the mask for general purpose interrupt, related to sdp0 pin.
gpi_sdp1 | 12 | 0b | sets/reads the mask for general purpose interrupt, related to sdp1 pin.
gpi_sdp2 | 13 | 0b | sets/reads the mask for general purpose interrupt, related to sdp2 pin.
gpi_sdp3 | 14 | 0b | sets/reads the mask for general purpose interrupt, related to sdp3 pin.
ptrap | 15 | 0b | sets/reads the mask for the probe trap interrupt.
reserved | 17:16 | 0b | reserved.
mng | 18 | 0b | sets/reads the mask for management event interrupt.
reserved | 19 | 0b | reserved.
omed | 20 | 0b | sets/reads the mask for other media energy detected interrupt.
reserved | 21 | 0b | reserved.
fer | 22 | 0b | sets/reads the mask for the fatal error interrupt.
nfer | 23 | 0b | sets/reads the mask for the non fatal error interrupt.
csrto | 24 | 0b | sets/reads the mask for the csr access time out indication interrupt.
sce | 25 | 0b | sets/reads the mask for the storm control event interrupt.
software wd | 26 | 0b | sets/reads the mask for the software watchdog interrupt.
reserved | 27 | 0b | reserved.
outsync | 28 | 0b | sets/reads the mask for dma tx out of sync interrupt.
reserved | 29 | 0b | reserved.
tcp timer | 30 | 0b | sets/reads the mask for tcp timer interrupt.
reserved | 31 | 0b | reserved.

8.8.10 interrupt mask clear register - imc (0x0150c; wo)

software uses this register to disable an interrupt. interrupts are presented to the bus interface only when the mask bit is set to 1b and the cause bit is set to 1b. the status of the mask bit is reflected in the interrupt mask set/read register (see section 8.8.9), and the status of the cause bit is reflected in the interrupt cause read register (see section 8.8.7). reading this register returns the value of the ims register.

software blocks interrupts by clearing the corresponding mask bit. this is accomplished by writing a 1b to the corresponding bit in this register. bits written with 0b are unchanged (their mask status does not change).

during interrupt handling, the software device driver should set all the bits in this register related to the current interrupt request, even if the interrupt was triggered by only some of the causes that were allocated to this vector.
field | bit(s) | initial value | description
txdw | 0 | 0b | clears the mask for transmit descriptor written back interrupt.
reserved | 1 | - | reserved.
lsc | 2 | 0b | clears the mask for link status change interrupt.
reserved | 3 | 0b | reserved.
rxdmt0 | 4 | 0b | clears the mask for receive descriptor minimum threshold hit interrupt.
macsec | 5 | 0b | clears the mask for macsec interrupt.
rxo | 6 | 0b | clears the mask for receiver overrun interrupt. sets on receive data fifo overrun.
rxdw | 7 | 0b | clears the mask for receiver descriptor write back interrupt.
vmmb | 8 | 0b | clears the mask for vm mailbox interrupt.
reserved | 9 | 0b | reserved.
reserved | 10 | 0b | reserved.
gpi_sdp0 | 11 | 0b | clears the mask for general purpose interrupt, related to sdp0 pin.
gpi_sdp1 | 12 | 0b | clears the mask for general purpose interrupt, related to sdp1 pin.
gpi_sdp2 | 13 | 0b | clears the mask for general purpose interrupt, related to sdp2 pin.
gpi_sdp3 | 14 | 0b | clears the mask for general purpose interrupt, related to sdp3 pin.
ptrap | 15 | 0b | clears the mask for the probe trap interrupt.
reserved | 17:16 | 0b | reserved.
mng | 18 | 0b | clears the mask for management event interrupt.
reserved | 19 | 0b | reserved.
omed | 20 | 0b | clears the mask for other media energy detected interrupt.
reserved | 21 | 0b | reserved.
fer | 22 | 0b | clears the mask for the fatal error interrupt.
nfer | 23 | 0b | clears the mask for the non fatal error interrupt.
csrto | 24 | 0b | clears the mask for the csr access time out indication interrupt.
sce | 25 | 0b | clears the mask for the storm control event interrupt.
software wd | 26 | 0b | clears the mask for software watchdog interrupt.
reserved | 27 | 0b | reserved.
outsync | 28 | 0b | clears the mask for dma tx out of sync interrupt.
reserved | 29 | 0b | reserved.
tcp timer | 30 | 0b | clears the mask for tcp timer interrupt.
reserved | 31 | 0b | reserved.

8.8.11 interrupt acknowledge auto mask register - iam (0x01510; r/w)

field | bit(s) | initial value | description
iam_value | 31:0 | 0b | an icr read or write will have the side effect of writing the contents of this register to the imc register. if gpie.nsicr = 0, then the copy of this register to ims will occur only if at least one bit is set in the ims and there is a true interrupt as reflected in icr.inta.

8.8.12 interrupt throttle - eitr (0x01680 + 4*n [n = 0...24]; r/w)

each eitr is responsible for an interrupt cause (rxtxq, tcp timer and other cause). the allocation of eitr-to-interrupt cause is through the ivar registers. for more information, see section 7.3.3.1 and section 7.3.3.2.

note: eitr register and interrupt mechanism is not reset by device reset (ctrl.dev_rst). occurrence of device reset interrupt causes immediate generation of all pending interrupts.

field | bit(s) | initial value | description
reserved | 1:0 | 0x0 | reserved.
interval | 14:2 | 0x0 | minimum inter-interrupt interval. the interval is specified in 1 µs increments. a zero disables interrupt throttling logic.
lli_en | 15 | 0b | lli moderation enable.
ll counter (rws) | 20:16 | 0x0 | reflects the current credits for that eitr for ll interrupts. if the cnt_ingr is not set this counter can be directly written by software at any time to alter the throttle's performance.
moderation counter (rws) | 30:21 | 0x0 | down counter, exposes only the 10 most significant bits of the real 12-bit counter. loaded with interval value whenever the associated interrupt is signaled. counts down to 0 and stops. the associated interrupt is signaled whenever this counter is zero and an associated (via the interrupt select register) eicr bit is set. if the cnt_ingr is not set this counter can be directly written by software at any time to alter the throttle's performance.
cnt_ingr (wo) | 31 | 0b | when set the hardware does not override the counters fields (itr counter and lli credit counter), so they keep their previous value. relevant for the current write only and is always read as zero.
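As a worked example of the EITR layout above, the sketch below packs an interval into bits 14:2 and computes the register offset for vector n. The helper names are hypothetical, clamping an oversized interval to the 13-bit maximum is a design choice made here (not something the datasheet text prescribes), and the interval unit is assumed to be microseconds because the unit character is damaged in this copy.

```c
#include <stdbool.h>
#include <stdint.h>

#define EITR_BASE            0x01680u
#define EITR_INTERVAL_SHIFT  2
#define EITR_INTERVAL_MAX    0x1FFFu      /* 13-bit field, bits 14:2 */
#define EITR_LLI_EN          (1u << 15)
#define EITR_CNT_INGR        (1u << 31)   /* keep hardware counters as they are */

/* Offset of EITR[n]; valid for n = 0...24. */
static uint32_t eitr_offset(unsigned n)
{
    return EITR_BASE + 4u * n;
}

/* Build a value to write to EITR. An interval of 0 disables throttling.
 * Counter fields (bits 30:16) are left at zero; with keep_counters set,
 * hardware is told not to override its current counter values. */
static uint32_t eitr_encode(uint32_t interval, bool lli_en, bool keep_counters)
{
    uint32_t v;

    if (interval > EITR_INTERVAL_MAX)
        interval = EITR_INTERVAL_MAX; /* clamp: assumption of this sketch */

    v = interval << EITR_INTERVAL_SHIFT;
    if (lli_en)
        v |= EITR_LLI_EN;
    if (keep_counters)
        v |= EITR_CNT_INGR;
    return v;
}
```

For instance, an interval of 125 yields a register value of 0x1F4, which under the microsecond assumption limits the vector to roughly 8000 interrupts per second.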
8.8.13 interrupt vector allocation registers - ivar (0x1700 + 4*n [n=0...7]; rw)

these registers have two modes of operation:
1. in msi-x mode these registers define the allocation of the different interrupt causes as defined in table 7-43 to one of the msi-x vectors. each int_alloc[i] (i=0...31) field is a byte indexing an entry in the msi-x table structure and msi-x pba structure.
2. in non msi-x mode these registers define the allocation of the rx and tx queues interrupt causes to one of the rxtxq bits in the eicr. each int_alloc[i] (i=0...31) field is a byte indexing the appropriate rxtxq bit as defined in table 7-42.

note: if invalid values are written to the int_alloc fields the result is unexpected.

dw | 31:24 | 23:16 | 15:8 | 7:0
0 | int_alloc[3] | int_alloc[2] | int_alloc[1] | int_alloc[0]
1 | ... | ... | ... | ...
... | ... | ... | ... | ...
7 | int_alloc[31] | int_alloc[30] | int_alloc[29] | int_alloc[28]

field | bit(s) | initial value | description
int_alloc[0] | 4:0 | 0x0 | defines the msi-x vector assigned to the interrupt cause associated with this entry, as defined in table 7-43. valid values are 0 to 24 for msi-x mode and 0 to 15 in non msi-x mode.
reserved | 6:5 | 00b | reserved.
int_alloc_val[0] | 7 | 0b | valid bit for int_alloc[0].
int_alloc[1] | 12:8 | 0x0 | defines the msi-x vector assigned to the interrupt cause associated with this entry, as defined in table 7-43. valid values are 0 to 24 for msi-x mode and 0 to 15 in non msi-x mode.
reserved | 14:13 | 00b | reserved.
int_alloc_val[1] | 15 | 0b | valid bit for int_alloc[1].
int_alloc[2] | 20:16 | 0x0 | defines the msi-x vector assigned to the interrupt cause associated with this entry, as defined in table 7-43. valid values are 0 to 24 for msi-x mode and 0 to 15 in non msi-x mode.
reserved | 22:21 | 00b | reserved.
int_alloc_val[2] | 23 | 0b | valid bit for int_alloc[2].
int_alloc[3] | 28:24 | 0x0 | defines the msi-x vector assigned to the interrupt cause associated with this entry, as defined in table 7-43. valid values are 0 to 24 for msi-x mode and 0 to 15 in non msi-x mode.
reserved | 30:29 | 00b | reserved.
int_alloc_val[3] | 31 | 0b | valid bit for int_alloc[3].
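Because each IVAR register holds four byte-wide entries, locating int_alloc[i] is a matter of dividing the index by four for the register and taking the remainder for the byte lane. A minimal sketch, assuming hypothetical helper names; which interrupt cause corresponds to which index i is defined in tables 7-42 and 7-43 and is not modelled here.

```c
#include <stdint.h>

#define IVAR_BASE        0x1700u
#define IVAR_VECTOR_MASK 0x1Fu   /* int_alloc field, 5 bits */
#define IVAR_VALID       0x80u   /* int_alloc_val bit within the byte */

/* Offset of the IVAR register that holds int_alloc[i], i = 0...31. */
static uint32_t ivar_offset(unsigned i)
{
    return IVAR_BASE + 4u * (i / 4u);
}

/* Return reg with int_alloc[i] set to vector and its valid bit set.
 * The other three entries in the register are preserved. */
static uint32_t ivar_set_entry(uint32_t reg, unsigned i, unsigned vector)
{
    unsigned shift = 8u * (i % 4u);

    reg &= ~(0xFFu << shift);
    reg |= ((vector & IVAR_VECTOR_MASK) | IVAR_VALID) << shift;
    return reg;
}
```

A caller would read the register at `ivar_offset(i)`, pass the value through `ivar_set_entry`, and write the result back; range checking of the vector (0 to 24 in msi-x mode, 0 to 15 otherwise) is left to the caller in this sketch.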
intel ? 82576eb gbe controller ? programming interface intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 512 8.8.14 interrupt vector allocation registers - misc ivar_misc (0x1740; rw) this register is used only in msi-x mode. this register defines the allocation of the ?other? and tcp timer interrupt causes to one of the msi-x vectors. 8.8.15 general purpose interrupt enable - gpie (0x1514; rw) field bit(s) initial value description int_alloc[32 ] 4:0 0x0 defines the msi-x vector assigned to the tcp timer interrupt cause. valid values are 0 to 24. reserved 6:5 00b reserved. int_alloc_va l[32] 7 0b valid bit for int_alloc[32] int_alloc[33 ] 12:8 0x0 defines the msi-x vector assigned to the ?other? interrupt cause. valid values are 0 to 24. reserved 14:13 00b reserved. int_alloc_va l[33] 15 0b valid bit for int_alloc[33] reserved 31:16 0x0 reserved. field bit(s) initial value description nsicr 0 0b non selective interrupt clear on read: when set, every read of icr clears it. when this bit is cleared, an icr read causes it to be cleared only if an actual interrupt was asserted or ims = 0b. reserved 3:1 0x0 reserved. multiple msix 4 0b 0 = on-msix, or msi-x with single vector, ivar map rx/tx causes to 16 eicr bits, but msix[0] is asserted for all. 1 = msix mode, ivar maps rx/tx causes to 25 msi-x vectors reflected in the first 25 bits of eicr. reserved 5 0b reserved reserved 6 0b reserved. ll interval 11:7 0x0 low latency credits increment rate. the interval is specified in 4 ? s increments. a value of 0x0 disables moderation of lli for all interrupt vectors. reserved 29:12 0x0 reserved. eiame 30 0b extended interrupt auto mask enable. when set (usually in msi-x mode); upon firing of an msi-x message, bits set in eiam associated with this message is cleared. otherwise, eiam is used only upon read or write of eicr/eics registers. 
PBA_support  31  0b  PBA support. When set, setting one of the extended interrupt masks via EIMS causes the PBA bit of the associated MSI-X vector to be cleared. Otherwise, the 82576 behaves in a way that supports legacy INT-x interrupts. Note: Should be cleared when working in INT-x or MSI mode and set in MSI-X mode.
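The IVAR_MISC and GPIE field layouts above can be illustrated with a small sketch that only computes register values; it does not touch hardware. The function and constant names are hypothetical, and the choice to set EIAME together with Multiple MSIX and PBA_support in MSI-X mode is an assumption based on the "usually in MSI-X mode" remark, not a requirement stated in the tables.

```python
# Bit positions are taken from the IVAR_MISC and GPIE field tables.
# All names here are illustrative, not part of any Intel API.
IVAR_VALID = 1 << 7        # INT_Alloc_val bit inside each 8-bit entry
MAX_VECTOR = 24            # valid MSI-X vector numbers are 0 to 24

GPIE_MULTIPLE_MSIX = 1 << 4
GPIE_EIAME = 1 << 30
GPIE_PBA_SUPPORT = 1 << 31


def ivar_misc_value(tcp_timer_vector, other_vector):
    """Build IVAR_MISC: TCP timer entry in bits 7:0, 'other' entry in bits 15:8."""
    for vector in (tcp_timer_vector, other_vector):
        if not 0 <= vector <= MAX_VECTOR:
            raise ValueError("MSI-X vector must be in the range 0 to 24")
    tcp_entry = tcp_timer_vector | IVAR_VALID
    other_entry = other_vector | IVAR_VALID
    return tcp_entry | (other_entry << 8)


def gpie_value(ll_interval_us=0, msix=True):
    """Build GPIE. LL Interval (bits 11:7) is expressed in 4 us steps."""
    steps, remainder = divmod(ll_interval_us, 4)
    if remainder or not 0 <= steps <= 0x1F:
        raise ValueError("interval must be a multiple of 4 us, at most 124 us")
    value = steps << 7
    if msix:
        # Assumption: enable auto-mask and PBA support whenever MSI-X is used.
        value |= GPIE_MULTIPLE_MSIX | GPIE_EIAME | GPIE_PBA_SUPPORT
    return value
```

For example, `ivar_misc_value(0, 1)` routes the TCP timer cause to vector 0 and the "other" cause to vector 1, with both valid bits set.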
8.9 MSI-X Table Register Descriptions

These registers are used to configure the MSI-X mechanism. The address and upper address registers set the address for each of the vectors. The message register sets the data sent to the relevant address. The vector control registers are used to enable specific vectors. The pending bit array register indicates which vectors have pending interrupts. The structure is listed in Table 8-19.

Table 8-19. MSI-X Table Structure
DWORD3  DWORD2  DWORD1  DWORD0
Vector Control  Msg Data  Msg Upper Addr  Msg Addr  Entry 0  Base
Vector Control  Msg Data  Msg Upper Addr  Msg Addr  Entry 1  Base + 1*16
Vector Control  Msg Data  Msg Upper Addr  Msg Addr  Entry 2  Base + 2*16
...
Vector Control  Msg Data  Msg Upper Addr  Msg Addr  Entry (N-1)  Base + (N-1)*16

Note: N = 25.

Table 8-20. MSI-X PBA Structure
63:0
Pending Bits 0 through 63  QWORD0  Base
Pending Bits 64 through 127  QWORD1  Base + 1*8
...
Pending Bits ((N-1) div 64)*64 through N-1  QWORD((N-1) div 64)  Base + ((N-1) div 64)*8

Note: N = 25. As a result, only QWORD0 is implemented.

8.9.1 MSI-X Table Entry Lower Address - MSIXTADD (BAR3: 0x0000 + 0x10*n [n=0...24]; R/W)

Field  Bit(s)  Initial Value  Description
Message Address LSB (RO)  1:0  0x0  For proper DWORD alignment, software must always write 0b's to these two bits. Otherwise, the result is undefined.
Message Address  31:2  0x0  System-specific message lower address. For MSI-X messages, the contents of this field from an MSI-X table entry specifies the lower portion of the DWORD-aligned address for the memory write transaction.
8.9.2 MSI-X Table Entry Upper Address - MSIXTUADD (BAR3: 0x0004 + 0x10*n [n=0...24]; R/W)

Field  Bit(s)  Initial Value  Description
Message Address  31:0  0x0  System-specific message upper address.

8.9.3 MSI-X Table Entry Message - MSIXTMSG (BAR3: 0x0008 + 0x10*n [n=0...24]; R/W)

Field  Bit(s)  Initial Value  Description
Message Data  31:0  0x0  System-specific message data. For MSI-X messages, the contents of this field from an MSI-X table entry specifies the data written during the memory write transaction. In contrast to message data used for MSI messages, the low-order message data bits in MSI-X messages are not modified by the function.

8.9.4 MSI-X Table Entry Vector Control - MSIXTVCTRL (BAR3: 0x000C + 0x10*n [n=0...24]; R/W)

Field  Bit(s)  Initial Value  Description
Mask  0  1b  When this bit is set, the function is prohibited from sending a message using this MSI-X table entry. However, any other MSI-X table entries programmed with the same vector are still capable of sending an equivalent message unless they are also masked.
Reserved  31:1  0x0  Reserved.

8.9.5 MSIXPBA Bit Description - MSIXPBA (BAR3: 0x02000; RO)

Field  Bit(s)  Initial Value  Description
Pending Bits  24:0  0x0  For each pending bit that is set, the function has a pending message for the associated MSI-X table entry. Pending bits that have no associated MSI-X table entry are reserved.
Reserved  31:25  0x0  Reserved.
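The BAR3 offsets given in the MSI-X table headings (0x10 bytes per entry, 25 entries, PBA at 0x2000) can be turned into a short address calculation. This is a sketch only; the helper names are hypothetical and nothing here reads a real device.

```python
MSIX_ENTRY_SIZE = 0x10      # four DWORDs per table entry
MSIX_NUM_VECTORS = 25       # N = 25, entries 0 to 24
MSIXPBA_OFFSET = 0x2000     # pending bit array, relative to BAR3

# Offset of each register inside one table entry, from the section headings.
FIELD_OFFSETS = {
    "MSIXTADD": 0x0,
    "MSIXTUADD": 0x4,
    "MSIXTMSG": 0x8,
    "MSIXTVCTRL": 0xC,
}


def msix_entry_offset(field, n):
    """Return the BAR3-relative offset of `field` for table entry n."""
    if not 0 <= n < MSIX_NUM_VECTORS:
        raise ValueError("entry index must be in the range 0 to 24")
    return FIELD_OFFSETS[field] + MSIX_ENTRY_SIZE * n


def pending_vectors(pba):
    """Decode an MSIXPBA value into a list of pending vector numbers.

    Bits 31:25 are reserved and are ignored here.
    """
    return [n for n in range(MSIX_NUM_VECTORS) if (pba >> n) & 1]
```

For instance, the vector control register of the last entry sits at `msix_entry_offset("MSIXTVCTRL", 24)`, which is 0x18C from the start of BAR3.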
8.9.6 MSI-X PBA Clear - PBACL (0x05B68; R/W1C)

Field  Bit(s)  Initial Value  Description
PENBIT  24:0  0x0  MSI-X pending bits clear. Writing a 1b to any bit clears the corresponding MSIXPBA bit; writing a 0b has no effect. Reading this register returns the PBA vector.
Reserved  31:25  0x0  Reserved.

8.10 Receive Register Descriptions

8.10.1 Receive Control Register - RCTL (0x00100; R/W)

This register controls all the 82576 receiver functions.

Field  Bit(s)  Initial Value  Description
Reserved  0  0b  Reserved. Write to 0b for future compatibility.
RXEN  1  0b  Receiver enable. The receiver is enabled when this bit is set to 1b. Writing this bit to 0b stops reception after receipt of any in-progress packet. All subsequent packets are then immediately dropped until this bit is set to 1b.
SBP  2  0b  Store bad packets. 0b = Do not store. 1b = Store bad packets. This bit controls the MAC receive behavior. A packet is required to pass the address (or normal) filtering before the SBP bit becomes effective. If SBP = 0b, then all packets with layer 1 or 2 errors are rejected. The appropriate statistic would be incremented. If SBP = 1b, then these packets are received (and transferred to host memory). The receive descriptor error field (RDESC.ERRORS) should have the corresponding bit(s) set to signal the software device driver that the packet has errors. In some operating systems the software device driver passes this information to the protocol stack. In either case, if a packet only has layer 3+ errors, such as IP or TCP checksum errors, and passes other filters, the packet is always received (layer 3+ errors are not used as a packet filter). Note: Symbol errors before the SFD are ignored. Any packet must have a valid SFD (RX_DV with no RX_ER in 10/100/1000BASE-T mode) in order to be recognized by the 82576 (even bad packets). Also, packets with errors are not routed to the MNG even if this bit is set.
UPE  3  0b  Unicast promiscuous enabled. 0b = Disabled. 1b = Enabled.
MPE  4  0b  Multicast promiscuous enabled. 0b = Disabled. 1b = Enabled.
LPE  5  0b  Long packet reception enable. 0b = Disabled. 1b = Enabled. LPE controls whether long packet reception is permitted. If LPE is 0b, hardware discards long packets over 1518, 1522 or 1526 bytes depending on the CTRL_EXT.EXT_VLAN bit and the detection of a VLAN tag in the packet. If LPE is 1b, the maximum packet size that the device can receive is defined in the RLPML.RLPML register (up to 9.5 KB).
LBM  7:6  00b  Loopback mode. Controls the loopback mode of the 82576. 00b = Normal operation (or PHY loopback in 10/100/1000BASE-T mode). 01b = MAC loopback (test mode). 10b = Undefined. 11b = Reserved. When using the internal PHY, LBM should remain set to 00b and the PHY instead configured for loopback through the MDIO interface. Note: PHY devices require programming for loopback operation using MDIO accesses.
Reserved  9:8  00b  Reserved.
Reserved  11:10  00b  Reserved. Set to 0b for compatibility.
MO  13:12  00b  Multicast offset. Determines which bits of the incoming multicast address are used in looking up the bit vector. 00b = Bits [47:36] of received destination multicast address. 01b = Bits [46:35] of received destination multicast address. 10b = Bits [45:34] of received destination multicast address. 11b = Bits [43:32] of received destination multicast address.
Reserved  14  0b  Reserved.
BAM  15  0b  Broadcast accept mode. 0b = Ignore broadcast (unless it matches through exact or imperfect filters). 1b = Accept broadcast packets.
BSIZE  17:16  00b  Receive buffer size. BSIZE controls the size of the receive buffers and permits software to trade off descriptor performance versus required storage space. Buffers that are 2048 bytes require only one descriptor per receive packet, maximizing descriptor efficiency. 00b = 2048 bytes. 01b = 1024 bytes. 10b = 512 bytes. 11b = 256 bytes. Note: BSIZE is not modified when RXEN is set to 1b. Set RXEN = 0b when modifying the buffer size (changing these bits).
VFE  18  0b  VLAN filter enable. 0b = Disabled (filter table does not decide packet acceptance). 1b = Enabled (filter table decides packet acceptance for 802.1Q packets). Three bits [20:18] control the VLAN filter table. The first determines whether the table participates in the packet acceptance criteria. The next two are used to decide whether the CFI bit found in the 802.1Q packet should be used as part of the acceptance criteria.
CFIEN  19  0b  Canonical form indicator enable. 0b = Disabled (CFI bit found in received 802.1Q packet's tag is not compared to decide packet acceptance). 1b = Enabled (CFI bit found in received 802.1Q packet's tag must match RCTL.CFI to accept 802.1Q type packet).
CFI  20  0b  Canonical form indicator bit value. 0b = 802.1Q packets with CFI equal to this field are accepted. 1b = 802.1Q packet is discarded.
PSP  21  0b  Pad small receive packets. If this field is set, SECRC should be set also.
DPF  22  0b  Discard pause frames with station MAC address. Controls whether pause frames directly addressed to this station are forwarded to the host. 0b = Incoming pause frames with station MAC address are forwarded to the host. 1b = Incoming pause frames with station MAC address are discarded. Note: Pause frames with other MAC addresses (multicast address) are always discarded unless the specific address is added to the accepted MAC addresses (either multicast or unicast).
PMCF  23  0b  Pass MAC control frames. Filters out unrecognized pause and other control frames. 0b = Pass/forward pause frames. 1b = Filter pause frames (default). PMCF controls the DMA function of MAC control frames (other than flow control). A MAC control frame in this context must be addressed to either the MAC control frame multicast address or the station address, match the type field, and not match the pause opcode of 0x0001. If PMCF = 1b then frames meeting these criteria are transferred to host memory.
Reserved  25:24  0b  Reserved. Should be written with 0b to ensure future compatibility.
SECRC  26  0b  Strip Ethernet CRC from incoming packet. Causes the CRC to be stripped from all packets. 0b = Does not strip CRC. 1b = Strips CRC. This bit controls whether the hardware strips the Ethernet CRC from the received packet. This stripping occurs prior to any checksum calculations. The stripped CRC is not transferred to host memory and is not included in the length reported in the descriptor.
Reserved  31:27  0x0  Reserved. Should be written with 0b to ensure future compatibility.

8.10.2 Split and Replication Receive Control - SRRCTL (0x0C00C + 0x40*n [n=0...15]; R/W)

Field  Bit(s)  Initial Value  Description
BSIZEPACKET  6:0  0x0  Receive buffer size for packet buffer. The value is in 1 KB resolution. Valid values can be from 1 KB to 127 KB. Default buffer size is 0 KB. If this field is equal to 0b, then RCTL.BSIZE determines the packet buffer size.
Reserved  7  0x0  Reserved.
BSIZEHEADER  11:8  0x4  Receive buffer size for header buffer. The value is in 64-byte resolution. Valid values can be from 64 bytes to 960 bytes. Default buffer size is 256 bytes. This field must be greater than 0 if the value of DESCTYPE is greater than or equal to 2.
Reserved  13:12  00b  Reserved. Must be set to 00b.
Reserved  19:14  0x0  Reserved.
RDMTS  24:20  0x0  Receive descriptor minimum threshold size. A low latency interrupt (LLI) associated with this queue is asserted whenever the number of free descriptors becomes equal to RDMTS multiplied by 16.
DESCTYPE  27:25  000b  Defines the descriptor in Rx. 000b = Legacy. 001b = Advanced descriptor one buffer. 010b = Advanced descriptor header splitting. 011b = Advanced descriptor header replication - replicate always. 100b = Advanced descriptor header replication large packet only (larger than header buffer size). 101b = Reserved. 111b = Reserved.
Reserved  30:28  0x0  Reserved.
Drop_En  31  0b/1b  Drop enabled. If set, packets received to the queue when no descriptors are available to store them are dropped. The packet is dropped only if there are not enough free descriptors in the host descriptor ring to store the packet. If there are enough descriptors in the host, but they are not yet fetched by the 82576, then the packet is not dropped and no packets are released until the descriptors are fetched. Default is 0b for queue 0 and 1b for the other queues.

8.10.3 Packet Split Receive Type - PSRTYPE (0x05480 + 4*n [n=0...7]; R/W)

This register enables or disables each type of header that needs to be split. Each register controls the behavior of 2 queues.
• Packet Split Receive Type Register (queue 0-1) - PSRTYPE0 (0x05480)
• Packet Split Receive Type Register (queue 2-3) - PSRTYPE1 (0x05484)
• Packet Split Receive Type Register (queue 4-5) - PSRTYPE2 (0x05488)
• Packet Split Receive Type Register (queue 6-7) - PSRTYPE3 (0x0548C)
• Packet Split Receive Type Register (queue 8-9) - PSRTYPE4 (0x05490)
• Packet Split Receive Type Register (queue 10-11) - PSRTYPE5 (0x05494)
• Packet Split Receive Type Register (queue 12-13) - PSRTYPE6 (0x05498)
• Packet Split Receive Type Register (queue 14-15) - PSRTYPE7 (0x0549C)

Field  Bit(s)  Initial Value  Description
Reserved  0  0b  Reserved.
PSR_type1  1  1b  Header includes MAC, (VLAN/SNAP) IPv4 only.
PSR_type2  2  1b  Header includes MAC, (VLAN/SNAP) IPv4, TCP only.
PSR_type3  3  1b  Header includes MAC, (VLAN/SNAP) IPv4, UDP only.
PSR_type4  4  1b  Header includes MAC, (VLAN/SNAP) IPv4, IPv6 only.
PSR_type5  5  1b  Header includes MAC, (VLAN/SNAP) IPv4, IPv6, TCP only.
PSR_type6  6  1b  Header includes MAC, (VLAN/SNAP) IPv4, IPv6, UDP only.
PSR_type7  7  1b  Header includes MAC, (VLAN/SNAP) IPv6 only.
PSR_type8  8  1b  Header includes MAC, (VLAN/SNAP) IPv6, TCP only.
PSR_type9  9  1b  Header includes MAC, (VLAN/SNAP) IPv6, UDP only.
Reserved  10  1b  Reserved.
PSR_type11  11  1b  Header includes MAC, (VLAN/SNAP) IPv4, TCP, NFS only.
PSR_type12  12  1b  Header includes MAC, (VLAN/SNAP) IPv4, UDP, NFS only.
Reserved  13  1b  Reserved.
PSR_type14  14  1b  Header includes MAC, (VLAN/SNAP) IPv4, IPv6, TCP, NFS only.
PSR_type15  15  1b  Header includes MAC, (VLAN/SNAP) IPv4, IPv6, UDP, NFS only.
Reserved  16  1b  Reserved.
PSR_type17  17  1b  Header includes MAC, (VLAN/SNAP) IPv6, TCP, NFS only.
PSR_type18  18  1b  Header includes MAC, (VLAN/SNAP) IPv6, UDP, NFS only.
Reserved  31:19  0x0  Reserved.

8.10.4 Replicated Packet Split Receive Type - RPLPSRTYPE (0x054C0; R/W)

This register enables or disables each type of header that needs to be split. This register controls the behavior of replicated packets.

Field  Bit(s)  Initial Value  Description
Reserved  0  0b  Reserved.
PSR_type1  1  1b  Header includes MAC, (VLAN/SNAP) IPv4 only.
PSR_type2  2  1b  Header includes MAC, (VLAN/SNAP) IPv4, TCP only.
PSR_type3  3  1b  Header includes MAC, (VLAN/SNAP) IPv4, UDP only.
PSR_type4  4  1b  Header includes MAC, (VLAN/SNAP) IPv4, IPv6 only.
PSR_type5  5  1b  Header includes MAC, (VLAN/SNAP) IPv4, IPv6, TCP only.
PSR_type6  6  1b  Header includes MAC, (VLAN/SNAP) IPv4, IPv6, UDP only.
PSR_type7  7  1b  Header includes MAC, (VLAN/SNAP) IPv6 only.
PSR_type8  8  1b  Header includes MAC, (VLAN/SNAP) IPv6, TCP only.
PSR_type9  9  1b  Header includes MAC, (VLAN/SNAP) IPv6, UDP only.
Reserved  10  1b  Reserved.
PSR_type11  11  1b  Header includes MAC, (VLAN/SNAP) IPv4, TCP, NFS only.
PSR_type12  12  1b  Header includes MAC, (VLAN/SNAP) IPv4, UDP, NFS only.
Reserved  13  1b  Reserved.
PSR_type14  14  1b  Header includes MAC, (VLAN/SNAP) IPv4, IPv6, TCP, NFS only.
PSR_type15  15  1b  Header includes MAC, (VLAN/SNAP) IPv4, IPv6, UDP, NFS only.
Reserved  16  1b  Reserved.
PSR_type17  17  1b  Header includes MAC, (VLAN/SNAP) IPv6, TCP, NFS only.
PSR_type18  18  1b  Header includes MAC, (VLAN/SNAP) IPv6, UDP, NFS only.
Reserved  31:19  0x0  Reserved.

8.10.5 Receive Descriptor Base Address Low - RDBAL (0x0C000 + 0x40*n [n=0...15]; R/W)

This register contains the lower bits of the 64-bit descriptor base address. The lower four bits are always ignored. The receive descriptor base address must point to a 128-byte aligned block of data.

Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x2800, 0x2900, 0x2A00 and 0x2B00 respectively.

Field  Bit(s)  Initial Value  Description
0  6:0  0x0  Ignored on writes. Returns 0b on reads.
RDBAL  31:7  X  Receive descriptor base address low.

8.10.6 Receive Descriptor Base Address High - RDBAH (0x0C004 + 0x40*n [n=0...15]; R/W)

This register contains the upper 32 bits of the 64-bit descriptor base address.

Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x2804, 0x2904, 0x2A04 and 0x2B04 respectively.

Field  Bit(s)  Initial Value  Description
RDBAH  31:0  X  Receive descriptor base address [63:32].

8.10.7 Receive Descriptor Ring Length - RDLEN (0x0C008 + 0x40*n [n=0...15]; R/W)

This register sets the number of bytes allocated for descriptors in the circular descriptor buffer. It must be 128-byte aligned.
Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x2808, 0x2908, 0x2A08 and 0x2B08 respectively.

Field  Bit(s)  Initial Value  Description
LEN  19:0  0x0  Descriptor ring length (in bytes). Bits 6:0 must be set to zero. Bits 3:0 always read as zero. The maximum allowed value is 0x80000 (32K descriptors).
Reserved  31:20  0x0  Reserved. Reads as 0b. Should be written to 0b for future compatibility.

8.10.8 Receive Descriptor Head - RDH (0x0C010 + 0x40*n [n=0...15]; RO)

The value in this register might point to descriptors that are still not in host memory. As a result, the host cannot rely on this value in order to determine which descriptor to process.

Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x2810, 0x2910, 0x2A10 and 0x2B10 respectively.

Field  Bit(s)  Initial Value  Description
RDH  15:0  0x0  Receive descriptor head.
Reserved  31:16  0x0  Reserved.

8.10.9 Receive Descriptor Tail - RDT (0x0C018 + 0x40*n [n=0...15]; R/W)

This register contains the tail pointers for the receive descriptor buffer. The register points to a 16-byte datum. Software writes the tail register to add receive descriptors to the hardware free list for the ring.

Note: Writing the RDT register while the corresponding queue is disabled is ignored by the 82576. In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x2818, 0x2918, 0x2A18 and 0x2B18 respectively.

Field  Bit(s)  Initial Value  Description
RDT  15:0  0x0  Receive descriptor tail.
Reserved  31:16  0x0  Reserved. Reads as 0b. Should be written to 0b for future compatibility.
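The per-queue register addressing and the ring length rules in sections 8.10.5 through 8.10.9 can be checked with a short calculation. The sketch below only derives offsets and register values; the function names are hypothetical, and the 16-byte descriptor size is taken from the RDT and RXDCTL descriptions.

```python
DESC_SIZE = 16            # bytes per receive descriptor
RING_ALIGN = 128          # RDBAL and RDLEN must be 128-byte aligned
MAX_RDLEN = 0x80000       # 32K descriptors
NUM_QUEUES = 16

REG_BASE = {"RDBAL": 0xC000, "RDBAH": 0xC004, "RDLEN": 0xC008,
            "RDH": 0xC010, "RDT": 0xC018}


def queue_reg(name, n):
    """Offset of a receive ring register for queue n (0 to 15)."""
    if not 0 <= n < NUM_QUEUES:
        raise ValueError("queue index must be in the range 0 to 15")
    return REG_BASE[name] + 0x40 * n


def legacy_alias(name, n):
    """82575-compatible alias of the same register, queues 0 to 3 only."""
    if not 0 <= n <= 3:
        raise ValueError("aliases exist only for queues 0 to 3")
    return 0x2800 + 0x100 * n + (REG_BASE[name] - 0xC000)


def ring_registers(base_addr, num_desc):
    """Values to program into RDBAL, RDBAH and RDLEN for one ring."""
    rdlen = num_desc * DESC_SIZE
    if base_addr % RING_ALIGN:
        raise ValueError("ring base must be 128-byte aligned")
    if rdlen % RING_ALIGN or not 0 < rdlen <= MAX_RDLEN:
        raise ValueError("ring length must be a multiple of 128 bytes, up to 0x80000")
    return {"RDBAL": base_addr & 0xFFFFFFFF,
            "RDBAH": base_addr >> 32,
            "RDLEN": rdlen}
```

Because each descriptor is 16 bytes, the 128-byte alignment of RDLEN means the descriptor count has to be a multiple of 8.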
8.10.10 Receive Descriptor Control - RXDCTL (0x0C028 + 0x40*n [n=0...15]; R/W)

This register controls the fetching and write-back of receive descriptors. The three threshold values are used to determine when descriptors are read from and written to host memory. The values are in units of descriptors (each descriptor is 16 bytes).

Field  Bit(s)  Initial Value  Description
PTHRESH  4:0  0x0  Prefetch threshold. PTHRESH is used to control when a prefetch of descriptors is considered. This threshold refers to the number of valid, unprocessed receive descriptors the 82576 has in its on-chip buffer. If this number drops below PTHRESH, the algorithm considers pre-fetching descriptors from host memory. This fetch does not happen unless there are at least HTHRESH valid descriptors in host memory to fetch. Note: HTHRESH should be given a non-zero value each time PTHRESH is used. Possible values for this field are 0 to 16.
Reserved  7:5  0x0  Reserved.
HTHRESH  12:8  0x0  Host threshold. Possible values for this field are 0 to 16.
Reserved  15:13  0x0  Reserved.
WTHRESH  20:16  0x01  Write-back threshold. WTHRESH controls the write-back of processed receive descriptors. This threshold refers to the number of receive descriptors in the on-chip buffer that are ready to be written back to host memory. In the absence of external events (explicit flushes), the write-back occurs only after at least WTHRESH descriptors are available for write-back. Possible values for this field are 0 to 31. Note: Since the default value for write-back threshold is 1b, the descriptors are normally written back as soon as one cache line is available. WTHRESH must contain a non-zero value to take advantage of the write-back bursting capabilities of the 82576.
Reserved  24:21  0x0  Reserved.
ENABLE  25  1b/0b  Receive queue enable. When set, the enable bit enables the operation of the specific receive queue. 1b = Enables queue. 0b = Disables queue. Default value for Q0 is 1b. Default value for Q15:1 is 0b. After a VF FLR to VF0, Q0 is also reset to zero. Setting this bit initializes all internal registers of the specific queue. Until then, the state of the queue is kept and can be used for debug purposes. When disabling a queue, this bit is cleared only after all activity in the queue has stopped. Note: This bit is valid only if the queue is actually enabled, thus if RCTL.RXEN is cleared, this bit remains zero.
SWFLUSH (WC)  26  0b  Receive software flush. Enables software to trigger receive descriptor write-back flushing, independently of other conditions. This bit is cleared by hardware.
Reserved  27  0x00  Reserved.
Reserved  28  0  Reserved.
Reserved  29  0  Reserved.
Reserved  31:30  0x00  Reserved.

Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x2828, 0x2928, 0x2A28 and 0x2B28 respectively.

8.10.11 Receive Queue Drop Packet Count - RQDPC (0xC030 + 0x40*n [n=0...15]; RC)

Field  Bit(s)  Initial Value  Description
RQDPC  11:0  0x0  Receive queue drop packet count. Counts the number of packets dropped by a queue due to lack of descriptors available.
Reserved  31:12  0x0  Reserved.

Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x2830, 0x2930, 0x2A30 and 0x2B30 respectively. Packets dropped due to the queue being disabled may not be counted by this register.

8.10.12 DMA Rx Max Outstanding Data - DRXMXOD (0x2540; RW)

This register limits the total number of data bytes that might be in the write pipe to the host memory. This allows received low latency packets to be serviced in a timely manner, as this limits the amount of data to be processed before the low latency packet is handled.

Field  Bit(s)  Initial Value  Description
Max_bytes_num_req  11:0  0x10  Max allowed number of bytes requests. The maximum size of the data in the write pipe (resolution is 256 bytes). If the total size is higher than the amount in the field, no arbitration is done and no new packet is sent.
Reserved  31:12  0x0  Reserved.
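The RXDCTL threshold fields from section 8.10.10 and the 256-byte resolution of DRXMXOD can be packed as shown in the sketch below. The helper names are hypothetical. Treating "PTHRESH used with HTHRESH equal to zero" as an error is a design choice made here for illustration; the datasheet only says HTHRESH should be non-zero in that case.

```python
RXDCTL_ENABLE = 1 << 25


def rxdctl_value(pthresh, hthresh, wthresh, enable=True):
    """Pack PTHRESH (4:0), HTHRESH (12:8), WTHRESH (20:16) and ENABLE (25)."""
    if not 0 <= pthresh <= 16 or not 0 <= hthresh <= 16:
        raise ValueError("PTHRESH and HTHRESH must be in the range 0 to 16")
    if not 0 <= wthresh <= 31:
        raise ValueError("WTHRESH must be in the range 0 to 31")
    if pthresh and not hthresh:
        # Illustrative strictness: the note asks for a non-zero HTHRESH.
        raise ValueError("HTHRESH should be non-zero when PTHRESH is used")
    value = pthresh | (hthresh << 8) | (wthresh << 16)
    if enable:
        value |= RXDCTL_ENABLE
    return value


def drxmxod_field(max_bytes):
    """Convert a byte limit to Max_bytes_num_req (256-byte units, 12 bits)."""
    field = max_bytes // 256          # rounds down to the field resolution
    if not 0 <= field <= 0xFFF:
        raise ValueError("limit does not fit in the 12-bit field")
    return field
```

The default Max_bytes_num_req value of 0x10 corresponds to `drxmxod_field(4096)`, that is, a 4096-byte limit on data in the write pipe.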
8.10.13 Receive Checksum Control - RXCSUM (0x05000; R/W)

The receive checksum control register controls the receive checksum off-loading features of the 82576. The 82576 supports the off-loading of three receive checksum calculations: the packet checksum, the IP header checksum, and the TCP/UDP checksum.

Note: This register should only be initialized (written) when the receiver is not enabled (only write this register when RCTL.RXEN = 0b).

Field  Bit(s)  Initial Value  Description
PCSS  7:0  0x0  Packet checksum start. Controls the packet checksum calculation. The packet checksum shares the same location as the RSS field and is reported in the receive descriptor when the RXCSUM.PCSD bit is cleared. If RXCSUM.IPPCSE is cleared (the default value), the checksum calculation that is reported in the Rx packet checksum field is the unadjusted 16-bit ones complement of the packet. The packet checksum starts from the byte indicated by RXCSUM.PCSS (0b corresponds to the first byte of the packet), after VLAN stripping if enabled by the CTRL.VME. For example, for an Ethernet II frame encapsulated as an 802.3ac VLAN packet and with RXCSUM.PCSS set to 14, the packet checksum would include the entire encapsulated frame, excluding the 14-byte Ethernet header (DA, SA, type/length) and the 4-byte VLAN tag. The packet checksum does not include the Ethernet CRC if the RCTL.SECRC bit is set. Software must make the required offsetting computation (to back out the bytes that should not have been included and to include the pseudo-header) prior to comparing the packet checksum against the L4 checksum stored in the packet checksum. The partial checksum in the descriptor is aimed to accelerate checksum calculation of fragmented UDP packets. If RXCSUM.IPPCSE is set, the packet checksum is aimed to accelerate checksum calculation of fragmented UDP packets. See also: section 7.1.10.2. Note: The PCSS value should not exceed a pointer to the IP header start. If exceeded, the IP header checksum or TCP/UDP checksum is not calculated correctly.
IPOFLD  8  1b  IP checksum off-load enable. RXCSUM.IPOFLD is used to enable the IP checksum off-loading feature. If RXCSUM.IPOFLD is set to 1b, the 82576 calculates the IP checksum and indicates a pass/fail indication to software via the IP checksum error bit (IPE) in the error field of the receive descriptor. Similarly, if RXCSUM.TUOFLD is set to 1b, the 82576 calculates the TCP or UDP checksum and indicates a pass/fail indication to software via the TCP/UDP checksum error bit (L4E). Similarly, if RFCTL.IPv6_DIS and RFCTL.IPv6XSUM_DIS are cleared to 0b and RXCSUM.TUOFLD is set to 1b, the 82576 calculates the TCP or UDP checksum for IPv6 packets. It then indicates a pass/fail condition in the TCP/UDP checksum error bit (RDESC.L4E). This applies to checksum off-loading only. Supported frame types: Ethernet II, Ethernet SNAP.
TUOFLD  9  1b  TCP/UDP checksum off-load enable.
Reserved  10  0b  Reserved.
CRCOFL  11  0b  CRC32 offload enable. Enables the CRC32 checksum off-loading feature. If RXCSUM.CRCOFL is set to 1b, the 82576 calculates the CRC32 checksum and indicates a pass/fail indication to software via the CRC32 checksum valid bit (CRCV) in the extended status field of the receive descriptor. In non-I/OAT mode, this bit is read only as 0b.
IPPCSE  12  0b  IP payload checksum enable. See the RXCSUM.PCSS description (above).
PCSD  13  0b  Packet checksum disable. The packet checksum and IP identification fields are mutually exclusive with the RSS hash. Only one of the two options is reported in the Rx descriptor. RXCSUM.PCSD with legacy Rx descriptor (SRRCTL.DESCTYPE = 000b): 0b (checksum enable) - Packet checksum is reported in the Rx descriptor. 1b (checksum disable) - Not supported. RXCSUM.PCSD with extended or header split Rx descriptor (SRRCTL.DESCTYPE != 000b): 0b (checksum enable) - Checksum and IP identification are reported in the Rx descriptor. 1b (checksum disable) - RSS hash value is reported in the Rx descriptor.
Reserved  31:14  0x0  Reserved.

8.10.14 Receive Long Packet Maximum Length - RLPML (0x5004; R/W)

Field  Bit(s)  Initial Value  Description
RLPML  13:0  0x2600  Maximum allowed long packet length. This length is the global length of the packet including all the potential headers or suffixes in the packet.
Reserved  31:14  0x0  Reserved.

8.10.15 Receive Filter Control Register - RFCTL (0x05008; R/W)

Field  Bit(s)  Initial Value  Description
Reserved  5:0  1b  Reserved.
NFSW_DIS  6  0b  NFS write disable. Disables filtering of NFS write request headers.
NFSR_DIS  7  0b  NFS read disable. Disables filtering of NFS read reply headers.
NFS_VER  9:8  00b  NFS version. 00b = NFS version 2. 01b = NFS version 3. 10b = NFS version 4. 11b = Reserved for future use.
IPv6_DIS  10  0b  IPv6 disable. Disables IPv6 packet filtering. Any received IPv6 packet is parsed only as an L2 packet.
IPv6XSUM_DIS  11  0b  IPv6 XSUM disable. Disables XSUM on IPv6 packets.
Reserved  12  0b  Reserved.
Reserved  13  0b  Reserved (was ACK accelerate disable and ACK data disable).
IPFRSP_DIS  14  0b  IP fragment split disable. When this bit is set, the headers of IP fragmented packets are not split.
Reserved  15  0b  Reserved.
Reserved  17:16  00b  Reserved. Must be set to 00b.
LEF  18  0b  Forward length error packet. 0b = Packets with length error are dropped. 1b = Packets with length error are forwarded to the host.
SYNQFP  19  0b  Defines the priority between SYNQF and 5-tuple filter. 0b = 5-tuple filter priority. 1b = SYN filter priority.
Reserved  31:20  0x08  Reserved. Should be written with 0b to ensure future compatibility.

8.10.16 Multicast Table Array - MTA (0x05200 + 4*n [n=0...127]; R/W)

There is one register per 32 bits of the multicast address table for a total of 128 registers. Software must mask to the desired bit on reads and supply a 32-bit word on writes. The first bit of the address used to access the table is set according to the RCTL.MO field.

Note: All accesses to this table must be 32 bit.

Figure 8-1 shows the multicast lookup algorithm. The destination address shown represents the internally stored ordering of the received DA. Note that bit 0 indicated in this diagram is the first on the wire.

Field  Bit(s)  Initial Value  Description
Bit Vector  31:0  X  Word-wide bit vector specifying 32 bits in the multicast address filter table.
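Since the lookup figure itself is not reproduced in this text, the following sketch shows one plausible reading of the multicast table lookup: a 12-bit index is taken from the destination address according to RCTL.MO, the upper 7 bits select one of the 128 MTA registers and the lower 5 bits select the bit inside it. The byte numbering (byte 0 is the first byte on the wire, so bytes 4 and 5 carry address bits [47:32]) is an assumption that matches the convention commonly used by open-source drivers for this device family; the function name is hypothetical.

```python
MTA_BASE = 0x5200
MTA_REGISTERS = 128         # 128 x 32 bits = 4096 filter bits

# Right shift applied to address bits [47:32] for each RCTL.MO setting:
# 00b -> bits [47:36], 01b -> [46:35], 10b -> [45:34], 11b -> [43:32].
MO_SHIFT = {0b00: 4, 0b01: 3, 0b10: 2, 0b11: 0}


def mta_position(mac, mo=0b00):
    """Return (register index, bit index) for a 6-byte destination address."""
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    # Assumed ordering: mac[0] is first on the wire, mac[5] holds bits [47:40].
    upper16 = mac[4] | (mac[5] << 8)
    index = (upper16 >> MO_SHIFT[mo]) & 0xFFF
    return index >> 5, index & 0x1F


def mta_register_offset(reg_index):
    """Offset of MTA[reg_index]; all accesses must be 32 bits wide."""
    if not 0 <= reg_index < MTA_REGISTERS:
        raise ValueError("register index must be in the range 0 to 127")
    return MTA_BASE + 4 * reg_index
```

Under these assumptions, the address 01:00:5E:00:00:01 with MO = 00b maps to bit 16 of MTA[0]; software would read that register, set the bit, and write the full 32-bit word back.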
Figure 8-1. Multicast Table Array

8.10.17 Receive Address Low - RAL (0x05400 + 8*n [n=0...15]; 0x054E0 + 8*n [n=0...7]; R/W)

Here "n" is the exact unicast/multicast address entry and is equal to 0, 1, ... 15.

These registers contain the lower bits of the 48-bit Ethernet address. All 32 bits are valid. These registers are reset by a software reset or platform reset. If an EEPROM is present, the first register (RAL0) is loaded from the EEPROM after a software or platform reset.

Note: The RAL field should be written in network order.

Field  Bit(s)  Initial Value  Description
RAL  31:0  X  Receive address low. Contains the lower 32 bits of the 48-bit Ethernet address.
programming interface ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 529 8.10.18 receive address high - rah (0x05404 + 8*n [n=0...15]; 0x054e4 + 8*n [n=0...7]; r/w) these registers contain the upper bits of the 48 bit ethernet address. the complete address is [rah, ral]. av determines whether this address is compared against the incoming packet and is cleared by a master reset. asel enables the 82576 to perform special filtering on receive packets. after reset, if an eeprom is present, the first register (receive address register 0) is loaded from the ia field in the eeprom with its address select field set to 00b and its address valid field set to 1b. if no eeprom is present, the address valid field is set to 0b and the address valid field for all of the other registers is set to 0b. note: the rah field should be written in network order. the first receive address register (rah0) is also used for exact match pause frame checking (da matches the first register). as a result, rah0 should always be used to store the individual ethernet mac address of the 82576. field bit(s) initial value description rah 15:0 x receive address high. contains the upper 16 bits of the 48-bit ethernet address. asel 17:16 x address selec.t selects how the address is to be used in the address filtering. 00b = destination address (required for normal mode) 01b = source address. this mode should not be used in virtualization mode. 10b = reserved 11b = reserved poolsel 25:18 0x0 pool select. in virtualization modes (mrqc.multiple receive queues enable = 011b - 101b) indicates which pool should get the packets matching this mac address. this field is a bit map (bit per vm) where more than one bit can be set according to the limitations defined in section 7.10.3.5 . if all the bits are zero, this address is used only for l2 filtering and is not used as part of the queueing decision. reserved 30:26 0b reserved. reads as 0b. 
ignored on writes. av 31 address valid. cleared after master reset. if an eeprom is present, the address valid field of the receive address register 0 is set to 1b after a software or pci reset or eeprom read. in entries 0-15 this bit is cleared by master reset.
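The RAL/RAH layout above can be sketched as a small encoder. This is a minimal illustration, not driver code: the function names are hypothetical, the mapping of entries 16-23 onto the second address bank is an assumption, and "network order" is interpreted here as placing the first MAC byte on the wire in bits 7:0 of RAL.

```python
def receive_address_offsets(entry):
    """Return (RAL, RAH) register offsets for a receive address entry.

    Assumption: entries 16..23 map onto the second bank (0x054E0 + 8*n, n=0..7).
    """
    if 0 <= entry <= 15:
        ral = 0x05400 + 8 * entry
    elif 16 <= entry <= 23:
        ral = 0x054E0 + 8 * (entry - 16)
    else:
        raise ValueError("entry must be in 0..23")
    return ral, ral + 4  # RAH sits 4 bytes above RAL in both banks


def encode_receive_address(mac, asel=0, pool_mask=0, valid=True):
    """Return (RAL, RAH) values for a 6-byte MAC address.

    Assumption: the first byte of the address lands in RAL bits 7:0.
    """
    if len(mac) != 6:
        raise ValueError("MAC address must be 6 bytes")
    if not 0 <= asel <= 3 or not 0 <= pool_mask <= 0xFF:
        raise ValueError("ASEL is 2 bits, POOLSEL is 8 bits")
    ral = int.from_bytes(mac[0:4], "little")   # RAL, bits 31:0
    rah = int.from_bytes(mac[4:6], "little")   # RAH, bits 15:0
    rah |= asel << 16                          # ASEL, bits 17:16
    rah |= pool_mask << 18                     # POOLSEL, bits 25:18
    rah |= int(valid) << 31                    # AV, bit 31
    return ral, rah


mac = bytes.fromhex("001b213a4c5d")
ral, rah = encode_receive_address(mac)
print(hex(ral), hex(rah))
```

Writing RAL before RAH (so that AV is set last) is a common precaution, since AV gates whether the entry is compared against incoming packets.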
8.10.19 VLAN Filter Table Array - VFTA (0x05600 + 4*n [n=0...127]; R/W)

There is one register per 32 bits of the VLAN filter table. The size of the word array depends on the number of bits implemented in the VLAN filter table. Software must mask to the desired bit on reads and supply a 32-bit word on writes.

Note: All accesses to this table must be 32 bit.

The algorithm for VLAN filtering using the VFTA is identical to that used for the multicast table array. Refer to section 8.10.16 for a block diagram of the algorithm. If VLANs are not used, there is no need to initialize the VFTA.

Field | Bit(s) | Initial value | Description
Bit vector | 31:0 | X | Double-word wide bit vector specifying 32 bits in the VLAN filter table.
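Because every VFTA access must be a full 32-bit word, setting one VLAN bit is a read-modify-write. The sketch below illustrates this against a software shadow copy of the table. The index mapping (VLAN ID bits 11:5 select the register, bits 4:0 select the bit) is an assumption that follows from 128 registers x 32 bits covering 4096 VLAN IDs; the text above does not spell it out, and the helper names are hypothetical.

```python
VFTA_BASE = 0x05600
VFTA_REGISTERS = 128


def vfta_location(vlan_id):
    """Return (register offset, bit position) for a 12-bit VLAN ID.

    Assumed mapping: register index = VLAN ID bits 11:5, bit = VLAN ID bits 4:0.
    """
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    index = (vlan_id >> 5) & 0x7F
    bit = vlan_id & 0x1F
    return VFTA_BASE + 4 * index, bit


def set_vlan(shadow, vlan_id, enable=True):
    """Update a 128-word shadow table and return (offset, new 32-bit word)."""
    offset, bit = vfta_location(vlan_id)
    index = (offset - VFTA_BASE) // 4
    if enable:
        shadow[index] |= 1 << bit
    else:
        shadow[index] &= ~(1 << bit) & 0xFFFFFFFF
    return offset, shadow[index]  # the whole word is what gets written


shadow = [0] * VFTA_REGISTERS
print(set_vlan(shadow, 100))
```

Keeping a shadow copy avoids a register read per update; a driver could equally read the register, modify the bit, and write the word back.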
8.10.20 Multiple Receive Queues Command Register - MRQC (0x05818; R/W)

Field | Bit(s) | Initial value | Description
Multiple Receive Queues Enable | 2:0 | 0x0 | Multiple receive queues enable. Enables support for multiple receive queues and defines the mechanism that controls queue allocation.
    000b = Multiple receive queues are disabled.
    001b = Reserved.
    010b = Multiple receive queues as defined by RSS for sixteen queues [1].
    011b = Multiple receive queues as defined by next generation VMDq based on packet destination MAC address. In this case, all the packets are forwarded to queue zero of each pool.
    100b = Multiple receive queues as defined by next generation VMDq based on packet destination MAC address.
    101b = Multiple receive queues as defined by next generation VMDq based on packet destination MAC address and RSS [1].
    110b = Multiple receive queues as defined by RSS [1].
    111b = Reserved.
    If VT is not supported, the only allowed values for this field are 000b, 001b, 010b and 110b. Writing any other value is ignored. The only allowed values for this field are 000b, 001b, 011b, and 101b. Writing any other value is ignored.
DEF_Q | 6:3 | 0x0 | Defines the default queue in non next generation VMDq modes. If Multiple Receive Queues Enable equals:
    000b = DEF_Q defines the destination of all packets.
    001b = Bits 5:3 define the LSB of the queue number.
    010b = DEF_Q defines the destination of all packets not forwarded by RSS.
    011b - 101b = This field is ignored.
    110b = Bits 5:3 define the LSB of the queue number of all packets not forwarded by RSS.
Reserved | 15:7 | 0x0 | Reserved.
RSS Field Enable | 31:16 | 0x0 | Each bit, when set, enables a specific field selection to be used by the hash function. Several bits can be set at the same time.
    Bit[16] = Enable TcpIPv4 hash function.
    Bit[17] = Enable IPv4 hash function.
    Bit[18] = Enable TcpIPv6Ex hash function.
    Bit[19] = Enable IPv6Ex hash function.
    Bit[20] = Enable IPv6 hash function.
    Bit[21] = Enable TcpIPv6 hash function.
    Bit[22] = Enable UdpIPv4.
    Bit[23] = Enable UdpIPv6.
    Bit[24] = Enable UdpIPv6Ext.
    Bit[25] = Reserved.
    Bits[31:26] = Reserved, zero.

[1] Note that the RXCSUM.PCSD bit should be set to enable reception of the RSS hash value in the receive descriptor.

Note: MRQC_EN is used to enable/disable RSS hashing and also to enable multiple receive queues. Disabling this feature is not recommended. Model usage is to reset the 82576 after disabling RSS.

8.10.21 RSS Random Key Register - RSSRK (0x05C80 + 4*n [n=0...9]; R/W)

The RSS random key register stores a 40-byte key used by the RSS hash function.

Field | Bit(s) | Initial value | Description
K0 | 7:0 | 0x0 | Byte n*4 of the RSS random key (n=0,1,...9).
K1 | 15:8 | 0x0 | Byte n*4+1 of the RSS random key (n=0,1,...9).
K2 | 23:16 | 0x0 | Byte n*4+2 of the RSS random key (n=0,1,...9).
K3 | 31:24 | 0x0 | Byte n*4+3 of the RSS random key (n=0,1,...9).

Register layout:

31:24 | 23:16 | 15:8 | 7:0
K[3] | K[2] | K[1] | K[0]
... | ... | ... | ...
K[39] | ... | ... | K[36]

8.10.22 Redirection Table - RETA (0x05C00 + 4*n [n=0...31]; R/W)

The redirection table is a 128-entry table with each entry being eight bits wide. Only one to four bits of each entry are used to store the queue index. The table is configured through the following R/W registers.

Each entry (byte) of the redirection table contains the following:
- Bits [7:4] - Reserved.
- Bits [3:0] - Queue index for all pools or in regular RSS. In RSS mode, all bits are used. In next generation VMDq + RSS mode, only bit 0 is used.

The contents of the redirection table are not defined following reset of the memory configuration registers. System software must initialize the table prior to enabling multiple receive queues. It can also update the redirection table during run time. Such updates of the table are not synchronized with the arrival time of received packets. Therefore, it is not guaranteed that a table update takes effect on a specific packet boundary.

Note: If the operating system provides a redirection table smaller than 128 bytes, software usually replicates the operating system-provided redirection table to span the whole 128 bytes of the hardware's redirection table.

Field | Bit(s) | Initial value | Description
Entry 0 | 7:0 | 0x0 | Determines the tag value and physical queue for index 4*n+0 (n=0...31).
Entry 1 | 15:8 | 0x0 | Determines the tag value and physical queue for index 4*n+1 (n=0...31).
Entry 2 | 23:16 | 0x0 | Determines the tag value and physical queue for index 4*n+2 (n=0...31).
Entry 3 | 31:24 | 0x0 | Determines the tag value and physical queue for index 4*n+3 (n=0...31).

Register layout:

31:24 | 23:16 | 15:8 | 7:0
Tag 3 | Tag 2 | Tag 1 | Tag 0
... | ... | ... | ...
Tag 127 | ... | ... | ...

Each tag:

7:4 | 3:0
Reserved | Queue index
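The byte-to-register packing for RSSRK and RETA, including the replication of a short operating-system table across all 128 entries, can be sketched as follows. The function names are hypothetical; the packing follows the field tables (byte 4*n+k occupies bits 8k+7:8k of register n).

```python
RETA_ENTRIES = 128


def pack_rss_key(key):
    """Pack a 40-byte RSS key into the ten 32-bit RSSRK register values."""
    if len(key) != 40:
        raise ValueError("RSS key must be exactly 40 bytes")
    # Byte n*4 goes to bits 7:0 (K0), byte n*4+3 to bits 31:24 (K3).
    return [int.from_bytes(key[4 * n:4 * n + 4], "little") for n in range(10)]


def build_reta(os_table):
    """Replicate a queue-index table to 128 entries and pack it into 32 words."""
    if not os_table or len(os_table) > RETA_ENTRIES:
        raise ValueError("table must hold between 1 and 128 entries")
    # Only bits 3:0 of each entry carry the queue index; bits 7:4 are reserved.
    entries = [os_table[i % len(os_table)] & 0xF for i in range(RETA_ENTRIES)]
    return [
        int.from_bytes(bytes(entries[4 * n:4 * n + 4]), "little")
        for n in range(RETA_ENTRIES // 4)
    ]


reta_words = build_reta([0, 1, 2, 3])      # a 4-entry table spread over 4 queues
key_words = pack_rss_key(bytes(range(40)))  # placeholder key, not a real secret
for n, word in enumerate(reta_words[:2]):
    print(f"RETA[{n}] @ {0x05C00 + 4 * n:#07x} = {word:#010x}")
```

Since the table contents are undefined after reset, all 32 RETA registers would be written before MRQC enables multiple receive queues.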
8.11 Filtering Register Descriptions

8.11.1 Immediate Interrupt Rx - IMIR (0x05A80 + 4*n [n=0...7]; R/W)

This register defines the filtering that selects which packets trigger a low latency interrupt. Another register includes a size threshold and a control bits bitmap to trigger an immediate interrupt.

Note: The Port field should be written in network order. If one of the actions for this filter is set, then at least one of the PORT_BP, SIZE_BP, Mask, or CTRLBIT_BP bits should be cleared.

Field | Bit(s) | Initial value | Description
Destination Port | 15:0 | 0x0 | Destination TCP port. This field is compared with the destination TCP port in incoming packets.
Immediate Interrupt | 16 | 0b | Enables issuing an immediate interrupt when the following conditions are met:
    - The 5-tuple filter associated with this register matches.
    - The length filter associated with this filter matches.
    - The TCP flags filter associated with this filter matches.
PORT_BP | 17 | X | Port bypass. When set to 1b, the TCP port check is bypassed and only the other conditions are checked. When set to 0b, the TCP port is checked against the Port field.
Reserved | 28:18 | 0x0 | Reserved.
Filter Priority | 31:29 | 000b | Defines the priority of the filter when two filters match. If two filters of the same priority match the incoming packet, any of the highest priority filters can be chosen.
8.11.2 Immediate Interrupt Rx Ext. - IMIREXT (0x05AA0 + 4*n [n=0...7]; R/W)

Field | Bit(s) | Initial value | Description
Size_Thresh | 11:0 | X | Size threshold. These 12 bits define a size threshold; a packet with a length below this threshold triggers an interrupt. Enabled by Size_Thresh_en.
Size_BP | 12 | X | Size bypass. When 1b, the size check is bypassed. When 0b, the size check is performed.
CtrlBit | 18:13 | X | Control bit. When a bit in this field equals 1b, an interrupt is immediately issued after receiving a packet with the corresponding TCP control bit turned on.
    Bit 13 (URG): Urgent pointer field significant.
    Bit 14 (ACK): Acknowledgment field.
    Bit 15 (PSH): Push function.
    Bit 16 (RST): Reset the connection.
    Bit 17 (SYN): Synchronize sequence numbers.
    Bit 18 (FIN): No more data from sender.
CtrlBit_BP | 19 | X | Control bits bypass. When set to 1b, the control bits check is bypassed. When set to 0b, the control bits check is performed.
Reserved | 31:20 | 0x0 | Reserved.

Note: The size used for this comparison is the size of the packet as forwarded to the host and does not include any of the fields stripped by the MAC (VLAN or CRC). As a result, the settings of the RCTL.SECRC and CTRL.VME bits should be taken into account when calculating the size threshold.

The value of the IMIR and IMIREXT registers after reset is unknown (apart from the IMIR.PORT_IM_EN bit, which is guaranteed to be cleared). Therefore, both registers should be programmed before IMIR.PORT_IM_EN is set for a given flow.

8.11.3 Source Address Queue Filter - SAQF (0x5980 + 4*n [n=0...7]; RW)

Field | Bit(s) | Initial value | Description
Source Address | 31:0 | 0x0 | IP source address, part of the 5-tuple queue filters.
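Since IMIREXT is undefined after reset, software has to write every field, including the bypass bits. A minimal sketch of composing the register value from the IMIREXT field table is shown below; the function and constant names are hypothetical, and passing `None` to mean "bypass this check" is a convention of this example only.

```python
# CtrlBit positions inside IMIREXT (bits 18:13).
CTRL_URG = 1 << 13
CTRL_ACK = 1 << 14
CTRL_PSH = 1 << 15
CTRL_RST = 1 << 16
CTRL_SYN = 1 << 17
CTRL_FIN = 1 << 18
CTRL_ALL = 0x3F << 13


def encode_imirext(size_thresh=None, ctrl_bits=None):
    """Build an IMIREXT value; None means the corresponding check is bypassed."""
    value = 0
    if size_thresh is None:
        value |= 1 << 12                  # Size_BP: bypass the size check
    else:
        if not 0 <= size_thresh <= 0xFFF:
            raise ValueError("Size_Thresh is a 12-bit field")
        value |= size_thresh              # Size_Thresh, bits 11:0
    if ctrl_bits is None:
        value |= 1 << 19                  # CtrlBit_BP: bypass the control bits check
    else:
        if ctrl_bits & ~CTRL_ALL:
            raise ValueError("ctrl_bits must only use bits 18:13")
        value |= ctrl_bits                # CtrlBit, bits 18:13
    return value                          # bits 31:20 stay reserved (zero)


# Interrupt immediately on packets shorter than 128 bytes, ignore TCP flags.
print(hex(encode_imirext(size_thresh=128)))
# Interrupt immediately on packets carrying PSH, ignore size.
print(hex(encode_imirext(ctrl_bits=CTRL_PSH)))
```

The threshold should already account for any VLAN tag or CRC bytes the MAC strips, as the note above describes.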
8.11.4 Destination Address Queue Filter - DAQF (0x59A0 + 4*n [n=0...7]; RW)

Field | Bit(s) | Initial value | Description
Destination Address | 31:0 | 0x0 | IP destination address, part of the 5-tuple queue filters.

8.11.5 Source Port Queue Filter - SPQF (0x59C0 + 4*n [n=0...7]; RW)

Field | Bit(s) | Initial value | Description
Source Port | 15:0 | 0x0 | TCP/UDP source port, part of the 5-tuple queue filters.
Reserved | 31:16 | 0x0 | Reserved.

8.11.6 5-tuple Queue Filter - FTQF (0x59E0 + 4*n [n=0...7]; RW)

Field | Bit(s) | Initial value | Description
Protocol | 7:0 | 0x0 | IP L4 protocol, part of the 5-tuple queue filters.
Queue Enable | 8 | 0b | When set, enables filtering of Rx packets by the 5-tuple defined in this filter to the queue indicated in this register.
VF | 11:9 | 0x0 | The VF index of the VF associated with this filter.
Reserved | 14:12 | 0 | Reserved (for extension of the number of VFs).
VF Mask | 15 | 1b (for legacy reasons) | Mask bit for the VF field. When set to 1b, the VF field is not compared as part of the 5-tuple filter. Software can clear (activate) the pool mask bit only when operating in virtualization mode.
Rx Queue | 25:16 | 0x0 | Identifies the Rx queue associated with this 5-tuple filter. If the VF Mask bit is set, the queue number is used as an offset to the VM list and does not override it.
Reserved | 26 | 0b | Reserved.
1588 Time Stamp | 27 | 0b | When set, packets that match this filter are time stamped according to the IEEE 1588 specification.
Mask | 31:28 | 0xF (for legacy reasons) | Mask bits for the 5-tuple fields (the mask bit for the destination port is in the IMIR register for legacy reasons). The corresponding field participates in the match if the bit below is cleared:
    Bit 28 - Mask protocol comparison.
    Bit 29 - Mask source address comparison.
    Bit 30 - Mask destination address comparison.
    Bit 31 - Mask source port comparison.
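The FTQF mask bits are inverted relative to intuition: a set bit excludes the field from the match. The sketch below builds an FTQF value from keyword flags that say which fields should participate. The function name and parameters are hypothetical, and the VF field is left masked (bit 15 set), which corresponds to non-virtualized operation.

```python
def encode_ftqf(protocol, rx_queue, match_protocol=True, match_src_addr=False,
                match_dst_addr=False, match_src_port=False, timestamp=False):
    """Build an FTQF value with Queue Enable set and the VF field masked."""
    if not 0 <= protocol <= 0xFF:
        raise ValueError("Protocol is an 8-bit field")
    if not 0 <= rx_queue <= 0x3FF:
        raise ValueError("Rx Queue is a 10-bit field")
    value = protocol                      # Protocol, bits 7:0
    value |= 1 << 8                       # Queue Enable
    value |= 1 << 15                      # VF Mask = 1b: VF field not compared
    value |= rx_queue << 16               # Rx Queue, bits 25:16
    value |= int(timestamp) << 27         # 1588 Time Stamp
    # A mask bit of 1b removes the field from the comparison.
    value |= int(not match_protocol) << 28
    value |= int(not match_src_addr) << 29
    value |= int(not match_dst_addr) << 30
    value |= int(not match_src_port) << 31
    return value


# Steer TCP (IP protocol 6) packets with a given source port to Rx queue 3.
ftqf = encode_ftqf(protocol=6, rx_queue=3, match_src_port=True)
print(hex(ftqf))
```

The source port itself would go into the matching SPQF register; the addresses in SAQF and DAQF are ignored in this example because their mask bits stay set.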
8.11.7 Immediate Interrupt Rx VLAN Priority - IMIRVP (0x05AC0; R/W)

Field | Bit(s) | Initial value | Description
VLAN_Pri | 2:0 | 000b | VLAN priority. This field holds the VLAN priority threshold. When VLAN_PRI_EN is set to 1b, an incoming packet with a VLAN tag whose priority field is equal to or higher than VLAN_Pri triggers an immediate interrupt, regardless of the EITR moderation.
VLAN_PRI_EN | 3 | 0b | VLAN priority enable. When set to 1b, an incoming packet with a VLAN tag whose priority is equal to or higher than VLAN_Pri triggers an immediate interrupt, regardless of the EITR moderation. When set to 0b, the interrupt is moderated by EITR.
Reserved | 31:4 | 0x0 | Reserved.

8.11.8 SYN Packet Queue Filter - SYNQF (0x55FC; RW)

Field | Bit(s) | Initial value | Description
Queue Enable | 0 | 0b | When set, enables forwarding of Rx packets to the queue indicated in this register.
Rx Queue | 4:1 | 0x0 | Identifies an Rx queue associated with SYN packets.
Reserved | 31:5 | 0x0 | Reserved.

8.11.9 EType Queue Filter - ETQF (0x5CB0 + 4*n [n=0...7]; RW)

Field | Bit(s) | Initial value | Description
EType | 15:0 | 0x0 | Identifies the protocol running on top of IEEE 802. Used to forward Rx packets containing this EType to a specific Rx queue.
Rx Queue | 19:16 | 0x0 | Identifies the Rx queue associated with this EType.
Reserved | 25:20 | 0x0 | Reserved.
Filter Enable | 26 | 0b | When set, this filter is valid. Any of the actions controlled by the following fields are gated by this field.
Reserved | 27 | 0b | Reserved.
BCN Frame | 28 | 0x0 | When set, packets with this EType are parsed according to the BCN specification.
Immediate Interrupt | 29 | 0x0 | When set, packets that match this filter generate an immediate interrupt.
1588 Time Stamp | 30 | 0b | When set, packets with this EType are time stamped according to the IEEE 1588 specification.
Queue Enable | 31 | 0b | When set, enables filtering of Rx packets by the EType defined in this register to the queue indicated in this register.

8.12 Transmit Register Descriptions

8.12.1 Transmit Control Register - TCTL (0x00400; R/W)

This register controls all transmit functions for the 82576.

Software can choose to abort packet transmission in less than the Ethernet mandated 16 collisions. For this reason, hardware provides CT.

Note: While 802.3x flow control is only defined during full duplex operation, the sending of pause frames via the SWXOFF bit is not gated by the duplex settings within the 82576. Software should not write a 1b to this bit while the 82576 is configured for half-duplex operation.

RTLC configures the 82576 to perform retransmission of packets when a late collision is detected. Note that the collision window is speed dependent: 64 bytes for 10/100 Mb/s and 512 bytes for 1000 Mb/s operation. If a late collision is detected when this bit is disabled, the transmit function assumes the packet has been transmitted successfully. This bit is ignored in full-duplex mode.

Field | Bit(s) | Initial value | Description
Reserved | 0 | 0b | Reserved. Write as 0b for future compatibility.
EN | 1 | 0b | Transmit enable. The transmitter is enabled when this bit is set to 1b. Writing 0b to this bit stops transmission after any in-progress packets are sent. Data remains in the transmit FIFO until the device is re-enabled. Software should combine this operation with a reset if the packets in the Tx FIFO should be flushed.
Reserved | 2 | 0b | Reserved. Reads as 0b. Should be written to 0b for future compatibility.
PSP | 3 | 1b | Pad short packets.
    0b = Do not pad.
    1b = Pad.
    Padding makes the packet 64 bytes long. This is not the same as the minimum collision distance. If padding of short packets is allowed, the total length of a packet not including FCS should be not less than 17 bytes.
CT | 11:4 | 0xF | Collision threshold. Determines the number of attempts at retransmission prior to giving up on the packet (not including the first transmission attempt). While this can be varied, it should be set to a value of 15 in order to comply with the IEEE specification requiring a total of 16 attempts. The Ethernet back-off algorithm is implemented and clamps to the maximum number of slot-times after 10 retries. This field only has meaning in half-duplex operation.
BST | 21:12 | 0x40 | Back-off slot time. This value determines the back-off slot time value in byte time.
SWXOFF | 22 | 0b | Software XOFF transmission. When set to 1b, the 82576 schedules the transmission of an XOFF (PAUSE) frame using the current value of the PAUSE timer (FCTTV.TTV). This bit self-clears upon transmission of the XOFF frame.
Reserved | 23 | 0b | Reserved.
RTLC | 24 | 0b | Re-transmit on late collision. When set, enables the 82576 to re-transmit on a late collision event.
Reserved | 25 | 0b | Reserved.
Reserved | 27:26 | 0x1 | Reserved.
Reserved | 31:28 | 0xA | Reserved.

8.12.2 Transmit Control Extended - TCTL_EXT (0x0404; R/W)

This register controls late collision detection.

COLD is used to determine the latest time in which a collision indication is considered as a valid collision and not a late collision. When using the internal PHY, the default value of 0x42 provides behavior consistent with the 802.3 specification.

When using an SGMII connected PHY, the SGMII adds some delay on top of the time budget allowed by the specification (collisions in valid network topologies can be expected even after 512 bit times). To accommodate this condition, COLD should be updated to take the SGMII inbound and outbound delays into account. The delay induced by the 82576 is 16 bit times at 10 Mb/s (add 2 to the COLD field value) and 40 bit times at 100 Mb/s (add 5 to the COLD field value). Any delay induced by the specific PHY used should also be added.

Field | Bit(s) | Initial value | Description
Reserved | 9:0 | 0x40 | Reserved.
COLD | 19:10 | 0x42 | Collision distance (in byte time). Used to determine the latest time in which a collision indication is considered as a valid collision and not a late collision.
Reserved | 31:20 | 0x0 | Reserved.

8.12.3 Transmit IPG Register - TIPG (0x0410; R/W)

This register controls the inter packet gap (IPG) timer.

Field | Bit(s) | Initial value | Description
IPGT | 9:0 | 0x08 | IPG back to back. Specifies the IPG length for back-to-back transmissions in both full and half duplex. Measured in increments of the MAC clock:
    - 8 ns MAC clock when operating @ 1 Gb/s.
    - 80 ns MAC clock when operating @ 100 Mb/s.
    - 800 ns MAC clock when operating @ 10 Mb/s.
    Note that an offset of 4 byte times is added to the programmed value to determine the total IPG. As a result, a value of 8 is recommended to achieve a 12 byte time IPG.
IPGR1 | 19:10 | 0x04 | IPG part 1. Specifies the portion of the IPG in which the transmitter defers to receive events. IPGR1 should be set to 2/3 of the total effective IPG (8). Measured in increments of the MAC clock:
    - 8 ns MAC clock when operating @ 1 Gb/s.
    - 80 ns MAC clock when operating @ 100 Mb/s.
    - 800 ns MAC clock when operating @ 10 Mb/s.
IPGR | 29:20 | 0x06 | IPG after deferral. Specifies the total IPG time for non back-to-back transmissions (transmission following deferral) in half duplex. Measured in increments of the MAC clock:
    - 8 ns MAC clock when operating @ 1 Gb/s.
    - 80 ns MAC clock when operating @ 100 Mb/s.
    - 800 ns MAC clock when operating @ 10 Mb/s.
    An offset of 5 byte times must be added to the programmed value to determine the total IPG after a defer event. A value of 7 is recommended to achieve a 12 byte effective IPG. Note that IPGR must never be set to a value greater than IPGT. If IPGR is set to a value equal to or larger than IPGT, it overrides the IPGT IPG setting in half duplex, resulting in inter-packet gaps that are larger than intended by IPGT. In this case, full duplex is unaffected and always relies on IPGT.
Reserved | 31:30 | 00b | Reserved. Reads as 0b. Should be written with 0b for future compatibility.

8.12.4 DMA Tx Control - DTXCTL (0x03590; R/W)

This register is used to set some parameters controlling the DMA Tx behavior.

Field | Bit(s) | Initial value | Description
Reserved | 0 | 0b | Reserved.
NoSnoop_LSO_hdr_buf | 1 | 0b | No-snoop header buffer of TSO packets. In TSO packets, the header is fetched again for each segment sent. When this bit is set, the header buffer fetch for all segments, apart from the first one, uses no-snoop mode. When cleared, all header buffer fetches use the attribute set in the DCA registers. The first segment always uses the attribute set in the DCA registers.
8023LL | 2 | 1b | 802.3 length location.
    1b = The location of the 802.3 length field in 802.3+SNAP packets is assumed to be 8 bytes before the end of the MAC header.
    0b = The location of the 802.3 length field in 802.3+SNAP packets is calculated from the beginning of the MAC header, assuming no VLAN is present in the packet sent by the software.
    This bit is used only in the case of large send (TSO) with SNAP mode.
Add VLAN location | 3 | 0b | 1b = Added by MAC - means that loopbacked packets are without VLAN.
    0b = Added by DMA - means that loopbacked packets are with VLAN.
OutOfSyncEnable | 4 | 0b | 0b = Out of sync mechanism is disabled.
    1b = Out of sync mechanism is enabled.
MDP_EN | 5 | 0b | Malicious driver protection enable.
    0b = Mechanism is disabled.
    1b = Mechanism is enabled.
SPOOF_INT | 6 | 1 | Interrupt on spoof behavior detection.
    0b = Mechanism is disabled.
    1b = Mechanism is enabled.
Default CTS tag | 23:8 | 0 | Defines the CTS tag to be used in case the CTS index sent in the descriptor is not available. This field is protected from writes by the CTSTXCTL.TXSGTLK bit.
Reserved | 31:7 | 0x0 | Reserved.

8.12.5 DMA Tx TCP Flags Control Low - DTXTCPFLGL (0x359C; RW)

This register holds the bit masks that are ANDed with the control flags in the TCP header for the first and middle segments of a TSO packet. See section 7.2.4.7.1 and section 7.2.4.7.2 for details on the use of this register.

Field | Bit(s) | Initial value | Description
TCP_flg_first_seg | 11:0 | 0xFF6 | TCP flags first segment. Bits that are ANDed with the TCP flags in the TCP header of the first segment.
Reserved | 15:12 | 0x00 | Reserved.
TCP_Flg_mid_seg | 27:16 | 0x76 | TCP flags middle segments. The low bits that are ANDed with the TCP flags in the TCP header of the middle segments.
Reserved | 31:28 | 0x00 | Reserved.
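The effect of the default DTXTCPFLGL masks can be checked with a few lines of arithmetic. This sketch assumes the 12-bit mask lines up with the TCP header flag bits in their standard order (FIN in bit 0, SYN in bit 1, and so on); that alignment is an assumption of the example, not something the field table states, and the names below are illustrative.

```python
# Standard TCP header flag bit positions (assumed to match the mask bits).
FIN, SYN, RST, PSH, ACK, URG = 0x001, 0x002, 0x004, 0x008, 0x010, 0x020

FIRST_SEG_MASK = 0xFF6   # default TCP_flg_first_seg
MID_SEG_MASK = 0x076     # default TCP_Flg_mid_seg


def segment_flags(header_flags, mask):
    """Return the flags a segment carries after the mask is ANDed in."""
    return header_flags & mask & 0xFFF


# A TSO header prepared by software with FIN, PSH and ACK set.
header = FIN | PSH | ACK
print(hex(segment_flags(header, FIRST_SEG_MASK)))  # FIN and PSH stripped
print(hex(segment_flags(header, MID_SEG_MASK)))    # FIN and PSH stripped
```

Under that assumption, both defaults clear bits 0 and 3, so FIN and PSH set in the template header are held back from the first and middle segments.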
8.12.6 DMA Tx TCP Flags Control High - DTXTCPFLGH (0x35A0; RW)

This register holds the bit mask that is ANDed with the control flags in the TCP header for the last segment of a TSO packet. See section 7.2.4.7.3 for details on the use of this register.

Field | Bit(s) | Initial value | Description
TCP_Flg_lst_seg | 11:0 | 0xF7F | TCP flags last segment. Bits that are ANDed with the TCP flags in the TCP header of the last segment.
Reserved | 31:12 | 0x00 | Reserved.

8.12.7 DMA Tx Max Total Allow Size Requests - DTXMXSZRQ (0x3540; RW)

This register limits the total number of data bytes that might be in outstanding PCIe requests from the host memory. This allows requests to send low latency packets to be serviced in a timely manner, as such a request is serviced right after the current outstanding requests are completed.

Field | Bit(s) | Initial value | Description
Max_bytes_num_req | 11:0 | 0x10 | Maximum allowed number of bytes in requests. The maximum allowed amount of outstanding requests, in units of 256 bytes. If the total size requested is higher than the amount in this field, no arbitration is done and no new packet is requested.
Reserved | 31:12 | 0x0 | Reserved.

8.12.8 Transmit Descriptor Base Address Low - TDBAL (0xE000 + 0x40*n [n=0...15]; R/W)

These registers contain the lower 32 bits of the 64-bit descriptor base address. The lower 7 bits are ignored. The transmit descriptor base address must point to a 128-byte aligned block of data.

Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x3800, 0x3900, 0x3A00 and 0x3B00 respectively.

Field | Bit(s) | Initial value | Description
0 | 6:0 | 0x0 | Ignored on writes. Returns 0b on reads.
TDBAL | 31:7 | X | Transmit descriptor base address low.

8.12.9 Transmit Descriptor Base Address High - TDBAH (0x0E004 + 0x40*n [n=0...15]; R/W)

These registers contain the upper 32 bits of the 64-bit descriptor base address.

Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x3804, 0x3904, 0x3A04 and 0x3B04 respectively.

Field | Bit(s) | Initial value | Description
TDBAH | 31:0 | X | Transmit descriptor base address [63:32].

8.12.10 Transmit Descriptor Ring Length - TDLEN (0x0E008 + 0x40*n [n=0...15]; R/W)

These registers contain the descriptor ring length. Each register indicates the length in bytes, which must be 128-byte aligned.

Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x3808, 0x3908, 0x3A08 and 0x3B08 respectively.

Field | Bit(s) | Initial value | Description
Reserved | 6:0 | 0x0 | Must be set to zero.
LEN | 19:7 | 0x0 | Descriptor ring length (number of 8-descriptor sets).
Reserved | 31:20 | 0x0 | Reserved. Reads as 0b. Should be written to 0b.

8.12.11 Transmit Descriptor Head - TDH (0x0E010 + 0x40*n [n=0...15]; RO)

These registers contain the head pointer for the transmit descriptor ring. The pointer refers to a 16-byte datum. Hardware controls this pointer.

Note: The values in these registers might point to descriptors that are still not in host memory. As a result, the host cannot rely on these values to determine which descriptors to release.

Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x3810, 0x3910, 0x3A10 and 0x3B10 respectively.

Field | Bit(s) | Initial value | Description
TDH | 15:0 | 0x0 | Transmit descriptor head.
Reserved | 31:16 | 0x0 | Reserved.
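The alignment rules for the ring registers follow from the 16-byte descriptor size: a 128-byte aligned length means the descriptor count is a multiple of 8. A minimal sketch of deriving the per-queue register offsets and the TDBAL/TDBAH/TDLEN values is shown below; the function names and the example address are hypothetical.

```python
DESCRIPTOR_SIZE = 16          # bytes per transmit descriptor
TDBAL, TDBAH, TDLEN, TDH = 0xE000, 0xE004, 0xE008, 0xE010


def tx_queue_register(base, queue):
    """Return the offset of a per-queue Tx register (stride 0x40, queues 0..15)."""
    if not 0 <= queue <= 15:
        raise ValueError("queue must be in 0..15")
    return base + 0x40 * queue


def ring_register_values(ring_phys_addr, num_descriptors):
    """Validate a ring and return the values for TDBAL, TDBAH and TDLEN."""
    if ring_phys_addr % 128 or not 0 <= ring_phys_addr < 1 << 64:
        raise ValueError("ring base must be a 128-byte aligned 64-bit address")
    if num_descriptors <= 0 or num_descriptors % 8:
        raise ValueError("descriptor count must be a positive multiple of 8")
    length = num_descriptors * DESCRIPTOR_SIZE
    if length >= 1 << 20:
        raise ValueError("ring length does not fit in TDLEN bits 19:7")
    return {
        "TDBAL": ring_phys_addr & 0xFFFFFFFF,   # low 32 bits, bits 6:0 are zero
        "TDBAH": ring_phys_addr >> 32,          # address bits 63:32
        "TDLEN": length,                        # bytes, multiple of 128
    }


values = ring_register_values(0x1_2345_6780, 256)
for name, base in (("TDBAL", TDBAL), ("TDBAH", TDBAH), ("TDLEN", TDLEN)):
    print(f"{name}[2] @ {tx_queue_register(base, 2):#07x} = {values[name]:#x}")
```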
8.12.12 Transmit Descriptor Tail - TDT (0x0E018 + 0x40*n [n=0...15]; R/W)

These registers contain the tail pointer for the transmit descriptor ring. The pointer refers to a 16-byte datum. Software writes the tail pointer to add more descriptors to the transmit ready queue. Hardware attempts to transmit all packets referenced by descriptors between head and tail.

Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x3818, 0x3918, 0x3A18 and 0x3B18 respectively.

Field | Bit(s) | Initial value | Description
TDT | 15:0 | 0x0 | Transmit descriptor tail.
Reserved | 31:16 | 0x0 | Reserved. Reads as 0b. Should be written to 0b for future compatibility.

8.12.13 Transmit Descriptor Control - TXDCTL (0x0E028 + 0x40*n [n=0...15]; R/W)

These registers control the fetching and write-back of transmit descriptors. The three threshold values are used to determine when descriptors are read from and written to host memory. The values are in units of descriptors (each descriptor is 16 bytes).

Since write-back of transmit descriptors is optional (under the control of the RS bit in the descriptor), not all processed descriptors are counted with respect to WTHRESH. Descriptors start accumulating after a descriptor in which RS is set. In addition, with transmit descriptor bursting enabled, some descriptors are written back that did not have RS set in their respective descriptors.

Note: When WTHRESH = 0b, only descriptors with the RS bit set are written back.

Note: In order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x3828, 0x3928, 0x3A28 and 0x3B28 respectively.

Field | Bit(s) | Initial value | Description
PTHRESH | 4:0 | 0x0 | Prefetch threshold. Controls when a prefetch of descriptors is considered. This threshold refers to the number of valid, unprocessed transmit descriptors the 82576 has in its on-chip buffer. If this number drops below PTHRESH, the algorithm considers pre-fetching descriptors from host memory. However, this fetch does not happen unless there are at least HTHRESH valid descriptors in host memory to fetch.
    Note: HTHRESH should be given a non-zero value each time PTHRESH is used.
Reserved | 7:5 | 0x0 | Reserved.
HTHRESH | 12:8 | 0x0 | Host threshold.
Reserved | 15:13 | 0x0 | Reserved. Reads as 0b. Should be written as 0b for future compatibility.
WTHRESH | 20:16 | 0x0 | Write-back threshold. Controls the write-back of processed transmit descriptors. This threshold refers to the number of transmit descriptors in the on-chip buffer that are ready to be written back to host memory. In the absence of external events (explicit flushes), the write-back occurs only after at least WTHRESH descriptors are available for write-back.
    Note: Since the default value for the write-back threshold is 0b, descriptors are normally written back as soon as they are processed. WTHRESH must be written to a non-zero value to take advantage of the write-back bursting capabilities of the 82576.
Reserved | 24:21 | 0x0 | Reserved.
Enable | 25 | 1b/0b | Transmit queue enable. When set, this bit enables the operation of a specific transmit queue:
    - Default value for Q0 = 1b.
    - Default value for Q15:1 = 0b.
    After a VF FLR to VF0, Q0 is also reset to zero. Setting this bit initializes all the internal registers of a specific queue. Until then, the state of the queue is kept and can be used for debug purposes. When disabling a queue, this bit is cleared only after all activity at the queue has stopped.
    Note: This bit is valid only if the queue is actually enabled; thus, if TCTL.TXEN is cleared, this bit remains zero.
SWFLSH | 26 | 0b | Transmit software flush. This bit enables software to trigger descriptor write-back flushing, independently of other conditions. This bit is self-cleared by hardware.
Reserved | 27 | 0b | Reserved (was priority).
Reserved | 28 | 0b | Reserved.
Reserved | 29 | 0 | Reserved.
Reserved | 31:30 | 0x00 | Reserved.
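The TXDCTL threshold fields can be packed with a small helper that also enforces the note about HTHRESH being non-zero whenever PTHRESH is used. This is a sketch with hypothetical names; the threshold values in the example are illustrative, not recommendations from the text.

```python
def encode_txdctl(pthresh=0, hthresh=0, wthresh=0, enable=True):
    """Pack PTHRESH, HTHRESH, WTHRESH and the queue Enable bit into TXDCTL."""
    for name, value in (("PTHRESH", pthresh), ("HTHRESH", hthresh),
                        ("WTHRESH", wthresh)):
        if not 0 <= value <= 0x1F:
            raise ValueError(f"{name} is a 5-bit field")
    if pthresh and not hthresh:
        # Per the PTHRESH note: HTHRESH should be non-zero when PTHRESH is used.
        raise ValueError("HTHRESH must be non-zero when PTHRESH is used")
    value = pthresh               # bits 4:0
    value |= hthresh << 8         # bits 12:8
    value |= wthresh << 16        # bits 20:16
    value |= int(enable) << 25    # Enable, bit 25
    return value


def txdctl_offset(queue):
    """Return the TXDCTL register offset for a queue (0..15)."""
    if not 0 <= queue <= 15:
        raise ValueError("queue must be in 0..15")
    return 0x0E028 + 0x40 * queue


# Burst write-backs in groups of 16, prefetch when fewer than 8 descriptors remain.
print(hex(txdctl_offset(1)), hex(encode_txdctl(pthresh=8, hthresh=1, wthresh=16)))
```

With WTHRESH left at zero the helper still produces a valid value; in that case only descriptors with RS set are written back, as the note above states.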
programming interface ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 547 8.12.14 tx descriptor completion write?back address low - tdwbal (0x0e038 + 0x40*n [n=0...15]; r/w) note: in order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x3838, 0x3938, 0x3a38 & 0x3b38 respectively. 8.12.15 tx descriptor completion write?back address high - tdwbah (0x0e03c + 0x40*n [n=0...15];r/w) note: in order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x383c, 0x393c, 0x3a3c & 0x3b3c respectively. 8.13 dca register descriptions 8.13.1 rx dca control registers - rxctl (0x0c014 + 0x40*n [n=0...15]; r/w) note: rx data write no-snoop is activated when the nse bit is set in the receive descriptor. field bit(s) initial value description head_wb_en 0 0b head write-back enable. 1b = head write-back is enabled. 0b = head write-back is disabled. when head_wb_en is set, sn_wb_en is ignored and no descriptor write-back is executed. wb on eitr 1 0b when set, a head write back is done upon eitr expiration. headwb_lo w 31:2 0x0 bits 31:2 of the head write-back memory location (dword aligned). last 2 bits of this field are ignored and are always interpreted as 00b, meaning that the actual address is qword aligned. bits 1:0 are always 00b. field bit(s) initial value description headwb_hig h 31:0 0x0 highest 32 bits of the head write-back memory location.
field | bit(s) | initial value | description
reserved | 4:0 | 0x0 | reserved.
rx descriptor dca en | 5 | 0b | descriptor dca enable. when set, hardware enables dca for all rx descriptors written back into memory. when cleared, hardware does not enable dca for descriptor write-backs. this bit is cleared as a default.
rx header dca en | 6 | 0b | rx header dca enable. when set, hardware enables dca for all received header buffers. when cleared, hardware does not enable dca for rx headers. this bit is cleared as a default.
rx payload dca en | 7 | 0b | payload dca enable. when set, hardware enables dca for all ethernet payloads written into memory. when cleared, hardware does not enable dca for ethernet payloads. this bit is cleared as a default.
rxdescreadnsen | 8 | 0b | rx descriptor read no snoop enable. this bit must be reset to 0b to ensure correct functionality (except if the software driver can guarantee the data is present in the main memory before the dma process occurs).
rxdescreadroen | 9 | 1b | rx descriptor read relax order enable.
rxdescwbnsen | 10 | 0b | rx descriptor write-back no snoop enable. this bit must be reset to 0b to ensure correct functionality of descriptor write-back.
rxdescwbroen (ro) | 11 | 0b | rx descriptor write-back relax order enable. this bit must be reset to 0b to ensure correct functionality of descriptor write-back.
rxdatawritensen | 12 | 0b | rx data write no snoop enable (header replication: header and data). when set to 0b, the last bit of the packet buffer address field in the advanced receive descriptor is used as the lsb of the packet buffer address (a0), thus enabling 8-bit alignment of the buffer. when set to 1b, the last bit of the packet buffer address field in the advanced receive descriptor is used as the no-snoop enabling (nse) bit (buffer is 16-bit aligned). if also set to 1b, the nse bit determines whether the data buffer is snooped or not.
rxdatawriteroen | 13 | 1b | rx data write relax order enable (header replication: header and data).
rxrepheadernsen | 14 | 0b | rx replicated/split header no snoop enable. this bit must be reset to 0b to ensure correct functionality of header write to host memory.
field | bit(s) | initial value | description
rxrepheaderroen | 15 | 1b | rx replicated/split header relax order enable.
reserved | 23:16 | 0b | reserved.
cpuid | 31:24 | 0x0 | physical id. legacy dca capable platforms - the device driver, upon discovery of the physical cpu id and cpu bus id, programs the cpuid field with the physical cpu and bus id associated with this rx queue. dca 1.0 capable platforms - the device driver programs a value, based on the relevant apic id, associated with this rx queue. see table 3.1.3.1.2.3 for details.

note: in order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x2814, 0x2914, 0x2a14 & 0x2b14 respectively.

8.13.2 tx dca control registers - txctl (0x0e014 + 0x40*n [n=0...15]; r/w)

field | bit(s) | initial value | description
reserved | 4:0 | 0 | reserved.
tx descriptor dca en | 5 | 0b | descriptor dca enable. when set, hardware enables dca for all tx descriptors written back into memory. when cleared, hardware does not enable dca for descriptor write-backs. this bit is cleared as a default and also applies to head write-back when enabled.
reserved | 7:6 | 00b | reserved.
txdescrdnsen | 8 | 0b | tx descriptor read no snoop enable. this bit must be reset to 0b to ensure correct functionality (unless the software device driver has written this bit with a write-through instruction).
txdescrdroen | 9 | 1b | tx descriptor read relax order enable.
txdescwbnsen | 10 | 0b | tx descriptor write-back no snoop enable. this bit must be reset to 0b to ensure correct functionality of descriptor write-back. also applies to head write-back, when enabled.
txdescwbroen | 11 | 1b | tx descriptor write-back relax order enable. applies to head write-back, when enabled.
txdatareadnsen | 12 | 0b | tx data read no snoop enable.
field | bit(s) | initial value | description
txdatareadroen | 13 | 1b | tx data read relax order enable.
reserved | 23:14 | 0 | reserved.
cpuid | 31:24 | 0x0 | physical id. legacy dca capable platforms - the device driver, upon discovery of the physical cpu id and cpu bus id, programs the cpuid field with the physical cpu and bus id associated with this tx queue. dca 1.0 capable platforms - the device driver programs a value, based on the relevant apic id, associated with this tx queue. see table 3.1.3.1.2.3 for details.

note: in order to keep compatibility with the 82575, for queues 0-3, these registers are aliased to addresses 0x3814, 0x3914, 0x3a14 & 0x3b14 respectively.

8.13.3 dca requester id information - dca_id (0x05b70; ro)

the dca requester id field, composed of device id, bus #, and function #, is set up in mmio space for software to program the dca requester id authentication register.

field | bit(s) | initial value | description
function number | 2:0 | 000b | function number. function number assigned to the function based on bios/operating system enumeration.
device number | 7:3 | 0x0 | device number. device number assigned to the function based on bios/operating system enumeration.
bus number | 15:8 | 0x0 | bus number. bus number assigned to the function based on bios/operating system enumeration.
reserved | 31:16 | 0x0 | reserved.
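As an illustration of the txctl addressing and the dca_id layout described above, the following C sketch computes the per-queue txctl offset, the 82575-compatible alias for queues 0-3, and splits a dca_id value into its bus, device and function fields. The function and struct names are hypothetical, and the 0x100 stride of the alias addresses is inferred from the four addresses listed in the note rather than stated by the datasheet.

```c
#include <assert.h>
#include <stdint.h>

/* Per-queue txctl offset: 0x0e014 + 0x40*n, n = 0..15 (from the 8.13.2 heading). */
uint32_t txctl_offset(unsigned n)
{
    assert(n < 16);
    return 0x0E014u + 0x40u * n;
}

/* 82575-compatible alias for queues 0-3. The 0x100 stride is an inference
 * from the listed addresses 0x3814, 0x3914, 0x3a14 and 0x3b14. */
uint32_t txctl_legacy_alias(unsigned n)
{
    assert(n < 4);
    return 0x3814u + 0x100u * n;
}

/* dca_id fields: function number 2:0, device number 7:3, bus number 15:8. */
struct dca_requester_id {
    uint8_t bus;
    uint8_t device;
    uint8_t function;
};

/* Split a raw dca_id register value into its three fields. */
struct dca_requester_id dca_id_decode(uint32_t dca_id)
{
    struct dca_requester_id id;
    id.function = (uint8_t)(dca_id & 0x7u);
    id.device   = (uint8_t)((dca_id >> 3) & 0x1Fu);
    id.bus      = (uint8_t)((dca_id >> 8) & 0xFFu);
    return id;
}
```

How the raw register value is read (mmio accessor, byte order handling) is left to the surrounding driver and is not shown here.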
8.13.4 dca control - dca_ctrl (0x05b74; r/w)

field | bit(s) | initial value | description
dca_dis | 0 | 1b | dca disable. 0b = dca tagging is enabled for this port. 1b = dca tagging is disabled for this port.
dca_mode | 4:1 | 0x0 | dca mode. 000b = legacy dca is supported. the tag field in the tlp header is based on the following coding: bit 0 is dca enable; bits 3:1 are cpu id. 001b = dca 1.0 is supported. when dca is disabled for a given message, the tag field is 0000,0000b. if dca is enabled, the tag is set per queue as programmed in the relevant dca control register. all other values are undefined.
reserved | 31:5 | 0x0 | reserved.

8.14 virtualization register descriptions

for all the registers in this section, vt_ctl replaces the vmd_ctl register of the 82575.
8.14.1 next generation vmdq control register - vt_ctl (0x0581c; r/w)

field | bit(s) | initial value | description
reserved | 6:0 | 0x0 | reserved.
def_pl | 9:7 | 00b | default pool - used to queue packets that did not pass any vm queuing decision.
reserved | 26:10 | 0x0 | reserved.
flp | 27 | 0b | filter local packets. filter incoming packets whose mac source address matches one of the incoming da mac addresses. if the sa of the received packet matches one of the da in the rah/ral registers, then the vm tied to this da does not receive the packet. other vms can still receive it.
igmac | 28 | 0x0 | if set, mac address is ignored during pool decision. pooling is based on vlan only. if this bit is set, then the vmolr.strvlan should be set to the same value for all pools.
dis_def_pool | 29 | 0x0 | drop if no pool is found. if this bit is asserted, then in rx switching in a virtualized environment, if there is no destination pool, the packet is discarded and not sent to the default pool. otherwise, it is sent to the pool defined by the def_pl field.
rpl_en | 30 | 0x0 | replication enable.
reserved | 31 | 0x0 | reserved.

8.14.2 physical function mailbox - pfmailbox (0x0c00 + 4*n [n=0...7]; rw)

field | bit(s) | initial value | description
sts (wo) | 0 | 0b | status/command from pf ready. setting this bit causes an interrupt to the relevant vf. this bit always reads as zero. setting this bit sets the pfsts bit in vfmailbox.
ack (wo) | 1 | 0b | vf message received. setting this bit causes an interrupt to the relevant vf. this bit always reads as zero. setting this bit sets the pfack bit in vfmailbox.
vfu | 2 | 0b | buffer taken by vf. this bit is ro for the pf and is a mirror of the vfu bit of the vfmailbox register.
field | bit(s) | initial value | description
pfu | 3 | 0b | buffer taken by pf. this bit can be set only if the vfu bit is cleared and is mirrored in the pfu bit of the vfmailbox register.
rvfu (wo) | 4 | 0b | reset vfu. setting this bit clears the vfu bit in the corresponding vfmailbox register - this bit should be used only if the vf driver is stuck. setting this bit also resets the corresponding bits in the mbvficr vfreq & vfack fields.
reserved | 31:5 | 0x0 | reserved.

the usage of the mailbox register set is described in section 7.10.2.9.1.

8.14.3 virtual function mailbox - vfmailbox (0x0c40 + 4*n [n=0...7]; rw)

field | bit(s) | initial value | description
req (wo) | 0 | 0b | request for pf ready. setting this bit causes an interrupt to the pf. this bit always reads as zero. setting this bit sets the corresponding bit in the vfreq field of the mbvficr register.
ack (wo) | 1 | 0b | pf message received. setting this bit causes an interrupt to the pf. this bit always reads as zero. setting this bit sets the corresponding bit in the vfack field of the mbvficr register.
vfu | 2 | 0b | buffer taken by vf. this bit can be set only if the pfu bit is cleared and is mirrored in the vfu bit of the pfmailbox register.
pfu | 3 | 0b | buffer taken by pf. this bit is ro for the vf and is a mirror of the pfu bit of the pfmailbox register.
pfsts (rc) | 4 | 0b | pf wrote a message in the mailbox.
pfack (rc) | 5 | 0b | pf acknowledged the vf previous message.
rsti | 6 | 1b | indicates that the pf had reset the shared resources and the reset sequence is in progress.
rstd (rc) | 7 | 0b | indicates that a pf software reset completed and the vf can start to use the device.
reserved | 31:8 | 0x0 | reserved.

8.14.4 virtualization mailbox memory - vmbmem (0x0800:0x083c + 0x40*n [n=0...7]; r/w)

mailbox memory for pf and vf drivers communication. locations can be accessed as 32-bit or 64-bit words. the memory is accessible to the pf and the vfs according to the following mapping.
ram address | function | pf bar 0 mapping (1) | vf bar 0 mapping
0 - 63 | vf0 ↔ pf | 0 - 63 | vmbmem:vmbmem + 63
64 - 127 | vf1 ↔ pf | 64 - 127 | vmbmem:vmbmem + 63
.... | | |
384 - 447 | vf7 ↔ pf | 384 - 447 | vmbmem:vmbmem + 63

(1) relative to vmbmem register.

field | bit(s) | initial value | description
mailbox data | 31:0 | x | mailbox data

8.14.5 mailbox vf interrupt causes register - mbvficr (0x0c80; r/w1c)

field | bit(s) | initial value | description
vfreq | 7:0 | 0x0 | vf #n wrote a message.
reserved | 15:8 | 0x0 | reserved.
vfack | 23:16 | 0x0 | vf #n acknowledged a pf message.
reserved | 31:24 | 0x0 | reserved.

8.14.6 mailbox vf interrupt mask register - mbvfimr (0x0c84; rw)

field | bit(s) | initial value | description
vfim | 7:0 | 0xff | mailbox indication from vf #n can cause an interrupt to the pf.
reserved | 31:8 | 0x0 | reserved.

8.14.7 flr events - vflre (0x0c88; r/w1c)

this register reflects the vflr events of the different vfs. it is accessible only to the pf. these bits are cleared by writing 1.

field | bit(s) | initial value | description
vflr | 7:0 | x | reflects a vflr event in vf7 to vf0 respectively.
reserved | 31:8 | 0x0 | reserved.
8.14.8 vf receive enable - vfre (0x0c8c; rw)

field | bit(s) | initial value | description
vfre | 7:0 | 0xff | enables filtering process to forward packets to vf7 to vf0 respectively. each bit is cleared by the relevant vflr or by a vf sw reset.
reserved | 31:8 | 0x0 | reserved.

8.14.9 vf transmit enable - vfte (0x0c90; rw)

field | bit(s) | initial value | description
vfte | 7:0 | 0xff | enables transmit process to forward packets from vf7 to vf0 respectively. each bit is cleared by the relevant vflr or by a vf sw reset.
reserved | 31:8 | 0x0 | reserved.

note: clearing one of the vfte bits may cause a transmit packet drop from the disabled queue.

8.14.10 wrong vm behavior register - wvbr (0x3554; rc)

field | bit(s) | initial value | description
wvm | 15:0 | 0x0 | bitmap indicating against which queue an anti-spoof action was taken.
 | 31:16 | 0x0 | (ro) - indicates queue that was blocked due to malicious behavior.

8.14.11 vm error count mask - vmecm (0x3510; rw)

field | bit(s) | initial value | description
filter | 7:0 | 0x0 | defines if a packet dropped from pools 0 to 7 respectively is counted in the ssvpc counter.
reserved | 31:8 | 0x0 | reserved.
8.14.12 last vm misbehavior cause - lvmmc (0x3548; rc)

field | bit(s) | initial value | description
mac spoof | 0 | 0b | a mac spoof attempt was detected.
vlan spoof | 1 | 0b | a vlan spoof attempt was detected.
legacy desc in rt/iov | 2 | 0b | a legacy desc in rt/iov was detected.
out of sync - single send | 3 | 0b | an out of sync misbehavior was detected in a single send operation.
out of sync - large send | 4 | 0b | an out of sync misbehavior was detected in a large send operation.
reserved | 11:5 | 0b | reserved.
l3 type | 12 | 0b | 0 = the error was detected in an ipv6 packet. 1 = the error was detected in an ipv4 packet.
l4 type | 14:13 | 0b | indicates the l4 type of the erroneous packet: 00b = udp. 01b = tcp. 10b = sctp. 11b = reserved.
reserved | 15 | 0b | reserved.
queue | 19:16 | 0x0 | queue in which the illegal behavior was detected.
reserved | 31:20 | 0x0 | reserved.

8.14.13 queue drop enable register - qde (0x2408; rw)

this register allows the pf to override the srrctl.drop_en bit set by the vf, to avoid head of line blocking issues if an un-trusted vf does not provide a receive descriptor to the hardware.

field | bit(s) | initial value | description
qde | 15:0 | 0x0 | enable drop packets from queue 15:0 respectively. this bit overrides the srrctl.drop_en bit of each queue. if either of the bits is set, a packet received when no descriptor is available is dropped.
reserved | 31:16 | 0x0 | reserved.

8.14.14 dma tx switch control - dtxswc (0x3500; r/w)

this register controls the security settings of the switch and enables the loopback mode.
field | bit(s) | initial value | description
macas | 7:0 | 0x0 | enable anti spoofing filter on mac addresses for vf7 to vf0 respectively.
vlanas | 15:8 | 0x0 | enable anti spoofing filter on vlan tags for vf7 to vf0 respectively.
lle | 23:16 | 0x0 | local loopback enable. when set, a packet originating from pool n and destined to pool n is looped back. if clear, the packet is dropped.
reserved | 30:24 | 0x0 | reserved.
loopback_en | 31 | 0b | enable next generation vmdq loopback.

8.14.15 vm vlan insert register - vmvir (0x3700 + 4*n [n=0..7]; rw)

field | bit(s) | initial value | description
port vlan id | 15:0 | 0x0 | port vlan tag to insert in case action = 1.
reserved | 29:16 | 0x0 | reserved.
vlana | 31:30 | 0x0 | vlan action: 00b = use descriptor command. 01b = always insert default vlan. 10b = never insert vlan. 11b = reserved.

8.14.16 vm offload register - vmolr (0x05ad0 + 4*n [n=0...7]; rw)

this register controls the offload and queueing options applied to each vf.

field | bit(s) | initial value | description
rlpml | 13:0 | 0x2600 | long packet size (9k default).
reserved | 15:14 | 0x0 | reserved.
lpe | 16 | 0b | long packet enable.
rsse | 17 | 0b | 0b = when in rss + next generation vmdq mode (mrqc.multiple receive queues enable = 101b), packets for this vm are forwarded to the pool's default queue as defined in vt_ctl.default next generation vmdq queue. 1b = when in rss + next generation vmdq mode (mrqc.multiple receive queues enable = 101b), packets for this vm are forwarded according to the rss redirection table (reta).
reserved | 23:18 | 0x0 | reserved.
aupe | 24 | 0b | accept untagged packets enable. when set, packets without vlan tag can be forwarded to this queue, assuming they pass the mac address queueing mechanism.
field | bit(s) | initial value | description
rompe | 25 | 0x0 | receive overflow multicast packets. accept packets that match the mta table.
rope | 26 | 0x0 | receive overflow packets. accept packets that match the uta table.
bam | 27 | 0x0 | broadcast accept.
mpe | 28 | 0x0 | multicast promiscuous.
reserved | 29 | 0x0 | reserved.
strvlan | 30 | 0x0 | vlan strip.
reserved | 31 | 0x1 | reserved. must be set to one.

8.14.17 replication offload register - rplolr (0x05af0; rw)

this register describes the offloads applied to multicast packets.

field | bit(s) | initial value | description
reserved | 29:0 | 0x0 | reserved.
strvlan | 30 | 0x0 | vlan strip.
strcrc | 31 | 0x1 | reserved.

8.14.18 vlan vm filter - vlvf (0x05d00 + 4*n [n=0...31]; rw)

this register set describes which vlans the local vms are part of. each register contains a vlan tag and a list of the vfs which are part of it. only packets with a vlan matching one of the vlan tags of which the vf is a member are forwarded to this vf.

field | bit(s) | initial value | description
vlan_id | 11:0 | 0x0 | defines a vlan tag to which each vm whose bit is set in the poolsel field belongs.
poolsel | 19:12 | 0x0 | pool select (bitmap).
lvlan | 20 | 0x0 | this vlan is local and packets with this vlan should not be forwarded to the external nic.
reserved | 30:21 | 0x0 |
vi_en | 31 | 0b | vlan id enable. this filter is valid.

8.14.19 unicast table array - uta (0xa000 + 4*n [n=0...127]; wo)

there is one register per 32 bits of the unicast address table for a total of 128 registers (the uta[127:0] designation). software must mask to the desired bit on reads and supply a 32-bit word on writes. the first bit of the address used to access the table is set according to the rctl.mo field.

note: all accesses to this table must be 32 bit. the lookup algorithm is the same one used for the mta table.
this table should be zeroed by software before start of work. the data returned when reading this table is unexpected.

field | bit(s) | initial value | description
bit vector | 31:0 | x | word wide bit vector specifying 32 bits in the unicast destination address filter table.

8.14.20 storm control control register - sccrl (0x5db0; rw)

field | bit(s) | initial value | description
mdipw | 0 | 0b | drop multicast packets (excluding flow control and manageability packets) if multicast threshold is exceeded in previous window.
mdicw | 1 | 0b | drop multicast packets (excluding flow control and manageability packets) if multicast threshold is exceeded in current window.
bdipw | 2 | 0b | drop broadcast packets (excluding flow control and manageability packets) if broadcast threshold is exceeded in previous window.
bdicw | 3 | 0b | drop broadcast packets (excluding flow control and manageability packets) if broadcast threshold is exceeded in current window.
bidu | 4 | 0b | bsc include destination unresolved: if bit is set, unicast received packets with no destination pool and sent to the default pool are included in ibsc.
rsvd | 7:5 | 0x0 | reserved.
interval | 17:8 | 0x8 | bsc/msc time-interval-specification. the interval size for applying ingress broadcast or multicast storm control. interrupt decisions are made at the end of each interval (and most flags are also set at interval end). setting this field resets the counter.
rsvd | 31:18 | 0x0 | reserved.

8.14.21 storm control status - scsts (0x5db4; ro)

field | bit(s) | initial value | description
bsca | 0 | 0b | broadcast storm control active.
bscap | 1 | 0b | broadcast storm control active in previous window.
msca | 2 | 0b | multicast storm control active.
mscap | 3 | 0b | multicast storm control active in previous window.
rsvd | 31:4 | 0x0 | reserved.
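Because the unicast table array (8.14.19) is write-only and reads return unexpected data, a driver would presumably keep its own copy of the table and write whole 32-bit words. The C sketch below shows one way to do that under stated assumptions: the 4096 table bits are treated as a linear vector (bit index / 32 selects the register, bit index mod 32 selects the bit), and the derivation of the bit index from the mac address (the mta-style lookup governed by rctl.mo) is not shown. All type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define UTA_BASE 0xA000u   /* uta register n is at 0xa000 + 4*n */
#define UTA_REGS 128u      /* n = 0..127, 32 table bits per register */

/* Software copy of the table; the hardware registers are write-only. */
struct uta_shadow {
    uint32_t word[UTA_REGS];
};

/* A pending 32-bit register write: register offset plus full word value. */
struct uta_write {
    uint32_t offset;
    uint32_t value;
};

/* Set one table bit in the shadow copy and return the write to issue.
 * Assumes a linear bit layout across the 128 registers. */
struct uta_write uta_set_bit(struct uta_shadow *s, uint32_t bit)
{
    assert(s != 0);
    assert(bit < UTA_REGS * 32u);

    uint32_t idx = bit >> 5;            /* which 32-bit register */
    s->word[idx] |= 1u << (bit & 31u);  /* which bit inside it */

    struct uta_write w = { UTA_BASE + 4u * idx, s->word[idx] };
    return w;
}
```

Zero-initializing the shadow structure and writing zero to all 128 registers at start-up matches the instruction that the table be zeroed before use.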
8.14.22 broadcast storm control threshold - bsctrh (0x5db8; rw)

field | bit(s) | initial value | description
utresh | 18:0 | 0x0 | traffic upper threshold-size. represents the upper threshold for broadcast storm control.
rsvd | 31:19 | 0x0 | reserved.

8.14.23 multicast storm control threshold - msctrh (0x5dbc; rw)

field | bit(s) | initial value | description
utresh | 18:0 | 0x0 | traffic upper threshold-size. represents the upper threshold for multicast storm control.
rsvd | 31:19 | 0x0 | reserved.

8.14.24 broadcast storm control current count - bsccnt (0x5dc0; ro)

field | bit(s) | initial value | description
ccount | 24:0 | 0x0 | ibsc traffic current count. represents the count of broadcast traffic received in the current time interval in units of 64-byte segments.
rsvd | 31:25 | 0x0 | reserved.

8.14.25 multicast storm control current count - msccnt (0x5dc4; ro)

field | bit(s) | initial value | description
ccount | 24:0 | 0x0 | imsc traffic current count. represents the count of multicast traffic received in the current time interval in units of 64-byte segments.
rsvd | 31:25 | 0x0 | reserved.

8.14.26 storm control time counter - sctc (0x5dc8; ro)

this register keeps track of the number of time units elapsed since the end of the last time interval.
field | bit(s) | initial value | description
count | 9:0 | 0x0 | sc time counter: the counter for number of time units elapsed since the end of the last time interval.
rsvd | 31:10 | 0x0 | reserved.

8.14.27 storm control basic interval - scbi (0x5dcc; rw)

this register defines the basic interval used as the base for the sccrl.interval counting at 10 mb/s speed. this register is defined in 16 ns clock cycles. at 1000/100 mb/s the interval is 100 or 10 times smaller, respectively.

field | bit(s) | initial value | description
bi | 24:0 | 0x5f5e10 | basic interval.
rsvd | 31:25 | 0x0 | reserved.

8.14.28 virtual mirror rule control - vmrctl (0x5d80 + 0x4*n [n=0..3]; rw)

this register controls the rules to be applied and the destination port.

field | bit(s) | initial value | description
vpme | 0 | 0b | virtual pool mirroring enable. reflects all the packets sent to a set of given vms.
upme | 1 | 0b | uplink port mirroring enable. reflects all the traffic received from the network.
dpme | 2 | 0b | downlink port mirroring enable. reflects all the traffic transmitted to the network. this means that when this bit is set, transmit traffic is mirrored to the mirrored port. there is no mirroring to the network.
vlme | 3 | 0b | vlan mirroring enable. reflects all the traffic received in a set of given vlans, either from the network or from local vms.
reserved | 7:4 | 0x0 | reserved.
mp | 10:8 | 0x0 | vm mirror port destination.
reserved | 31:11 | 0x0 | reserved.

8.14.29 virtual mirror rule vlan - vmrvlan (0x5d90 + 0x4*n [n=0..3]; rw)

this register controls the vlan ports as listed in the vlvf table taking part in the vlan mirror rule.
field | bit(s) | initial value | description
vlan | 31:0 | 0x0 | bitmap listing which vlans participate in the mirror rule.

8.14.30 virtual mirror rule vm - vmrvm (0x5da0 + 0x4*n [n=0..3]; rw)

this register controls the vms mirrored to the mirror port if vmrctl.vpme is set.

field | bit(s) | initial value | description
vm | 7:0 | 0x0 | bitmap listing which vms participate in the mirror rule.
reserved | 31:8 | 0x0 | reserved.

8.14.31 transmit rate-er config - rc (0x36b0; rw)

field | bit(s) | initial value | description
rf_dec | 13:0 | | tx rate-scheduler rate factor hexadecimal part, for the tx queue indexed by txdq_idx field in dqsel register. rate factor bits that come after the hexadecimal point. meaningful only if rs_ena bit is set.
rf_int | 23:14 | | tx rate-scheduler rate factor integer part, for the tx queue indexed by txdq_idx field in dqsel register. rate factor bits that come before the hexadecimal point. rate factor is defined as the ratio between the nominal link rate (1 gb/s) and the maximum rate allowed to that queue. minimum allowed bandwidth share for a queue is 0.1% of the link rate (1 mb/s, leading to a maximum allowed rate factor of 1000). meaningful only if rs_ena bit is set.
reserved | 30:24 | 0 | reserved
rs_ena (sc) | 31 | 0, rw | tx rate-scheduler enable, for the tx queue indexed by txdq_idx field in dqsel register. when set, the rate programmed in this register is enforced (the queue is rate controlled). at the time it is set, the current timer value is loaded into the timestamp stored for that entry. when cleared, the rate factor programmed in this register is meaningless, the switch for that queue is always forced to "on". the queue is not rate-controlled.
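The rate factor in the rc register above is a fixed-point number: rf_int holds the 10 bits before the point and rf_dec the 14 bits after it, so the two fields together form one 24-bit value in bits 23:0. The C sketch below computes that value from a link rate and a per-queue maximum rate and sets rs_ena. It is only an illustration: the function name and the kb/s units are hypothetical, the fractional part is truncated (the datasheet does not state a rounding rule), and selecting the queue through dqsel beforehand is not shown.

```c
#include <assert.h>
#include <stdint.h>

#define RC_RF_FRAC_BITS 14u          /* rf_dec occupies bits 13:0 */
#define RC_RF_MASK      0x00FFFFFFu  /* rf_int (23:14) plus rf_dec (13:0) */
#define RC_RS_ENA       0x80000000u  /* rs_ena, bit 31 */

/* Build an rc register value that limits a queue to queue_kbps on a link
 * running at link_kbps. rate factor = link rate / queue rate. */
uint32_t rc_encode(uint32_t link_kbps, uint32_t queue_kbps)
{
    assert(queue_kbps > 0 && queue_kbps <= link_kbps);
    /* Minimum share is 0.1% of the link rate, i.e. rate factor <= 1000. */
    assert((uint64_t)queue_kbps * 1000u >= link_kbps);

    /* Fixed-point division with 14 fractional bits; the result is truncated. */
    uint64_t rf = ((uint64_t)link_kbps << RC_RF_FRAC_BITS) / queue_kbps;

    return (uint32_t)(rf & RC_RF_MASK) | RC_RS_ENA;
}
```

For example, a 400 mb/s cap on a 1 gb/s link gives a rate factor of 2.5, which encodes as rf_int = 2 and rf_dec = 0x2000.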
8.14.32 transmit rate-er status - (0x36b4; ro)

8.15 tx bandwidth allocation to vm register description

these registers are owned by the pf in an iov mode.

8.15.1 vm bandwidth allocation control & status - vmbacs (0x3600; rw)

field | bit(s) | initial value | description
8byte_val | 7:0 | 0x04 | 8-bytes time counter value in dma clocks unit. link speed of 1gbps - must be set to 0x04. link speed of 100mbps - must be set to 0x28.
reserved | 15:8 | 0x08 | reserved
vmba_en1 | 19:16 | 0 | vm bandwidth allocation enable field 1. 0x0 - for non-virtualized contexts. 0x7 - for virtualized contexts.
vmba_set | 20 | 0, ro | vm bandwidth allocation is set. (ro) when set, it indicates that at least one queue is currently rate-controlled for achieving the bandwidth allocation scheme to vms. used by sw for the link speed change procedure. when cleared, the vm rate-controllers are all disabled.
reserved | 23:21 | 010b | reserved - must be set to its initial value.
vmba_en2 | 27:24 | 0x0 | vm bandwidth allocation enable field 2. 0x0 - for non-virtualized contexts. 0xf - for virtualized contexts.
reserved | 30:28 | 0 | reserved
speed_chg | 31 | 0 / read & clear only | link speed has changed. set by hw to indicate that the link speed has changed. cleared by sw at the end of the link speed change procedure.

8.15.2 vm bandwidth allocation max memory window - vmbammw (0x3670; rw)

field | bit(s) | initial value | description
mmw_size | 10:0 | 0 | max memory window size for the vm rate-controllers. it is the maximum amount of 1kb units of transmit compensation payload that can be accumulated for a tx queue.
reserved | 31:11 | 0 | reserved
8.15.3 vm bandwidth allocation select - vmbasel (0x3604; rw)

field | bit(s) | initial value | description
txq_idx | 3:0 | 0 | tx queue index. this register is used to select the tx queue for which the bandwidth share configuration will be set. prior to write access the vmrsc register, software has to set this field with the index of the tx queue to be accessed.
reserved | 31:4 | 0 | reserved.

8.15.4 vm bandwidth allocation config - vmbac (0x3608; rw)

field | bit(s) | initial value | description
rf_dec | 13:0 | x | vm rate factor hexadecimal part, for the tx queue indexed by txq_idx field in vmbasel register. rate factor bits that come after the hexadecimal point. rate factor (rf) is defined as the ratio between the link speed (1 gb/s or 100mbps) and the rate allocated to that vm. assign rf to vms so that: - sum(vm rates) = link speed, i.e. 1 gb/s or 100mbps. - minimum allowed bandwidth share for a vm is 0.1% of the link speed. limit the maximum rate factor accordingly. - meaningful only if rc_ena bit is set.
rf_int | 23:14 | x | vm rate factor integer part, for the tx queue indexed by txq_idx field in vmbasel register. rate factor bits that come before the hexadecimal point. meaningful only if rc_ena bit is set.
reserved | 30:24 | 0 | reserved
rc_ena | 31 | 0, rw / ro if vt is fused-off | vm rate-controller enable, for the tx queue indexed by txq_idx field in vmbasel register. when set, the bandwidth share allocated to the vm by programming this register is enforced, used for virtualized contexts. when cleared, other fields in this register are meaningless. the vm operates in the "bandwidth takeover" mode, taking over the link's bandwidth left unused by others. for non-virtualized contexts, this bit must be cleared for all the tx queues.
8.16 timer register descriptions

8.16.1 watchdog setup - wdstp (0x01040; r/w)

field | bit(s) | initial value | description
wd_enable | 0 | 0b (1) | enable watchdog timer.
wd_timer_load_enable (sc) | 1 | 0b | enables the load of the watchdog timer by writing to wd_timer field. if this bit is not set, the wd_timer field is loaded by the value of wd_timeout. note: writing to this field is only for dfx purposes.
reserved | 15:2 | 0x0 | reserved
wd_timer (rws) | 23:16 | wd_timeout | indicates the current value of the timer. resets to the timeout value each time the 82576 functional bit in software device status register is set. if this timer expires, the wd interrupt to the firmware and the wd sdp are asserted. as a result, this timer is stuck at zero until it is re-armed. note: writing to this field is only for dfx purposes.
wd_timeout | 31:24 | 0x0 (1) | defines the number of seconds until the watchdog expires. the granularity of this timer is 1 sec. the minimal value allowed for this register when the watchdog mechanism is enabled is two. setting this field to 1b might cause the watchdog to expire immediately.

(1) value read from the eeprom.

8.16.2 watchdog software device status - wdswsts (0x01044; r/w)

field | bit(s) | initial value | description
dev_functional (sc) | 0 | 0b | each time this bit is set, the watchdog timer is re-armed. this bit is self clearing.
force_wd (sc) | 1 | 0b | setting this bit causes the wd timer to expire immediately. the wd_timer field is set to 0b. it can be used by software in order to indicate some fatal error detected in the software or in the hardware. this bit is self clearing.
reserved | 23:2 | 0x0 | reserved.
stuck reason | 31:24 | 0x0 | this field can be used by software to indicate to the firmware the reason the 82576 is malfunctioning. the encoding of this field is software/firmware dependent. a value of 0b indicates a functional 82576.

8.16.3 free running timer - frtimer (0x01048; rws)

this register reflects the value of a free running timer that can be used for various timeout indications. the register is reset by a pci reset and/or software reset.

note: writing to this register is for dfx purposes only.
field | bit(s) | initial value | description
microsecond | 9:0 | x | number of microseconds in the current millisecond.
millisecond | 19:10 | x | number of milliseconds in the current second.
seconds | 31:20 | x | number of seconds from the timer start (up to 4095 seconds).

8.16.4 tcp timer - tcptimer (0x0104c; r/w)

field | bit(s) | initial value | description
duration | 7:0 | 0x0 | duration. duration of the tcp interrupt interval in msec.
kickstart (ws) | 8 | 0b | counter kick-start. writing a 1b to this bit kick-starts the counter down-count from the initial value defined in the duration field. writing a 0b has no effect.
tcpcounten | 9 | 0b | tcp count enable. 1b = tcp timer counting enabled. 0b = tcp timer counting disabled. once enabled, the tcp counter counts from its internal state. if the internal state is equal to 0b, the down-count does not restart until kickstart is activated. if the internal state is not 0b, the down-count continues from internal state. this enables a pause in the counting for debug purpose.
tcpcountfinish (ws) | 10 | 0b | tcp count finish. this bit enables software to trigger a tcp timer interrupt, regardless of the internal state. writing a 1b to this bit triggers an interrupt and resets the internal counter to its initial value. down-count does not restart until either kickstart is activated or loop is set. writing a 0b has no effect.
loop | 11 | 0b | tcp loop. when set to 1b, the tcp counter reloads duration each time it reaches zero, and continues down-counting from this point without kick-starting. when set to 0b, the tcp counter stops at a zero value and does not re-start until kickstart is activated. note: setting this bit alone is not enough to start the timer activity. the kickstart bit should also be set.
reserved | 31:12 | - | reserved.
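The two timer layouts above lend themselves to a short worked example. The C sketch below converts a raw frtimer value (seconds 31:20, millisecond 19:10, microsecond 9:0) into a single microsecond count, and builds a tcptimer value for a periodic interrupt by combining duration with kickstart, tcpcounten and loop, following the note that loop alone does not start the timer. The function names are hypothetical, and writing all three control bits in a single register write is an assumption; the datasheet does not state a required ordering.

```c
#include <assert.h>
#include <stdint.h>

/* Convert a raw frtimer value to elapsed microseconds since timer start. */
uint64_t frtimer_to_us(uint32_t raw)
{
    uint32_t us = raw & 0x3FFu;          /* microsecond, bits 9:0  */
    uint32_t ms = (raw >> 10) & 0x3FFu;  /* millisecond, bits 19:10 */
    uint32_t s  = (raw >> 20) & 0xFFFu;  /* seconds, bits 31:20     */
    return (uint64_t)s * 1000000u + (uint64_t)ms * 1000u + us;
}

#define TCPTIMER_KICKSTART (1u << 8)   /* kickstart (ws) */
#define TCPTIMER_COUNT_EN  (1u << 9)   /* tcpcounten */
#define TCPTIMER_LOOP      (1u << 11)  /* loop */

/* Build a tcptimer value for a repeating interrupt every duration_ms
 * milliseconds. duration is an 8-bit field, so 1..255 ms is accepted. */
uint32_t tcptimer_periodic(unsigned duration_ms)
{
    assert(duration_ms >= 1 && duration_ms <= 0xFF);
    return (uint32_t)duration_ms
         | TCPTIMER_KICKSTART
         | TCPTIMER_COUNT_EN
         | TCPTIMER_LOOP;
}
```

Since the seconds field is 12 bits wide, the value returned by frtimer_to_us wraps after 4096 seconds; a caller measuring longer intervals has to account for that itself.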
8.17 time sync register descriptions

8.17.1 rx time sync control register - tsyncrxctl (0xb620; rw)

field | bit(s) | initial value | description
rxtt (ro/v) | 0 | 0x0 | rx timestamp valid (= 1b when a valid value for rx timestamp is captured in the rx timestamp register; cleared by read of rx timestamp register rxsatrh).
type | 3:1 | 0x0 | type of packets to timestamp. 000b - time stamp l2 (v2) packets only (sync or delay_req depends on message type in section 8.17.23 and packets with message id 2 and 3). 001b - time stamp l4 (v1) packets only (sync or delay_req depends on message type in section 8.17.23). 010b - time stamp v2 (l2 and l4) packets (sync or delay_req depends on message type in section 8.17.23 and packets with message id 2 and 3). 100b - time stamp all packets (in this mode no locking is done to the value in the timestamp registers and no indication in receive descriptors is transferred). 101b - time stamp all packets in which message id bit 3 is zero, which means timestamp all event packets. this is applicable for v2 packets only. 011b, 110b and 111b - reserved.
en | 4 | 0b | enable rx timestamp. 0 = time stamping disabled. 1 = time stamping enabled.
reserved | 9:5 | 0 | reserved.
rsv | 31:6 | 0x0 | reserved.

8.17.2 rx timestamp low - rxstmpl (0x0b624; ro)

field | bit(s) | initial value | description
rxstmpl | 31:0 | 0x0 | rx timestamp lsb value.

8.17.3 rx timestamp high - rxstmph (0x0b628; ro)

field | bit(s) | initial value | description
rxstmph | 31:0 | 0x0 | rx timestamp msb value.
8.17.4 rx timestamp attributes low - rxsatrl (0x0b62c; ro)

field | bit(s) | initial value | description
sourceidl | 31:0 | 0x0 | sourceuuid low.

8.17.5 rx timestamp attributes high - rxsatrh (0x0b630; ro)

field | bit(s) | initial value | description
sourceidh | 15:0 | 0x0 | sourceuuid high
sequenceid | 31:16 | 0x0 | sequenceid

8.17.6 tx time sync control register - tsynctxctl (0x0b614; rw)

field | bit(s) | initial value | description
txtt (ro/v) | 0 | 0b | tx timestamp valid (equals 1b when a valid value for tx timestamp is captured in the tx timestamp register; cleared by read of tx timestamp register txstmph).
rsv | 3:1 | 0x0 | reserved.
en | 4 | 0b | enable tx timestamp. 0b = time stamping disabled. 1b = time stamping enabled.
rsv | 31:5 | 0x0 | reserved.

8.17.7 tx timestamp value low - txstmpl (0x0b618; ro)

field | bit(s) | initial value | description
txstmpl | 31:0 | 0x0 | tx timestamp lsb value.

8.17.8 tx timestamp value high - txstmph (0x0b61c; ro)

field | bit(s) | initial value | description
txstmph | 31:0 | 0x0 | tx timestamp msb value.
8.17.9 system time register low - systiml (0x0b600; rws)

field | bit(s) | initial value | description
stl | 31:0 | 0x0 | system time lsb value.

8.17.10 system time register high - systimh (0x0b604; rws)

field | bit(s) | initial value | description
sth | 31:0 | 0x0 | system time msb value.

8.17.11 increment attributes register - timinca (0x0b608; rw)

field | bit(s) | initial value | description
iv | 23:0 | 0x0 | increment value.
ip | 31:24 | 0x0 | increment period in 16 ns resolution.

8.17.12 time adjustment offset register low - timadjl (0x0b60c; rw)

field | bit(s) | initial value | description
tadjl | 31:0 | 0x0 | time adjustment value - low.

8.17.13 time adjustment offset register high - timadjh (0x0b610; rw)

field | bit(s) | initial value | description
tadjh | 30:0 | 0x0 | time adjustment value - high.
sign | 31 | 0b | sign (0b = '+', 1b = '-').
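The timinca and timadjl/timadjh layouts lend themselves to a short packing sketch. The helper names below are hypothetical. The code only shows how the documented fields fit into 32-bit words (a sign-and-magnitude split for the adjustment) and makes no claim about which values suit a particular clock.

```python
def pack_timinca(increment_value, increment_period):
    """Pack iv (bits 23:0) and ip (bits 31:24) into one timinca word."""
    if not 0 <= increment_value < (1 << 24):
        raise ValueError("iv must fit in 24 bits")
    if not 0 <= increment_period < (1 << 8):
        raise ValueError("ip must fit in 8 bits")
    return (increment_period << 24) | increment_value


def encode_timadj(adjustment):
    """Encode a signed adjustment as (timadjl, timadjh).

    The magnitude is split across tadjl (31:0) and tadjh (30:0);
    bit 31 of timadjh carries the sign (0b = '+', 1b = '-').
    """
    magnitude = abs(adjustment)
    if magnitude >= (1 << 63):
        raise ValueError("magnitude must fit in 63 bits")
    sign = 1 if adjustment < 0 else 0
    timadjl = magnitude & 0xFFFFFFFF
    timadjh = (sign << 31) | (magnitude >> 32)
    return timadjl, timadjh
```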
8.17.14 timesync auxiliary control register - tsauxc (0x0b640; rw)

field | bit(s) | initial value | description
en_tt0 | 0 | 0b | enable target time 0. software sets this bit to 1b to enable the feature. hardware clears the bit when the target time is hit.
en_tt1 | 1 | 0b | enable target time 1. software sets this bit to 1b to enable the feature. hardware clears the bit when the target time is hit.
reserved | 2 | 0b | reserved.
utt0 | 3 | 0b | use target time 0 to reload the clk_out 0 down counter.
st0 | 4 | 0b | start clock out toggle only on target time 0; at this point a rising edge of clock out occurs (the clock output is set to 0b on assertion of this bit).
en_clk1 | 5 | 0b | enable configurable frequency clock 1.
reserved | 5 | 0b | reserved.
utt1 | 6 | 0b | use target time 1 to reload the clk_out 1 down counter.
st1 | 7 | 0b | start clock out toggle only on target time 1; at this point a rising edge of clock out occurs (the clock output is set to 1b on assertion of this bit).
en_ts0 | 8 | 0b | enable hardware time stamp 0.
autt0 | 9 | 0b | auxiliary timestamp taken. cleared when auxiliary timestamp 0 is read.
en_ts1 | 10 | 0b | enable hardware time stamp 1.
autt1 | 11 | 0b | auxiliary timestamp taken. cleared when auxiliary timestamp 1 is read.
mask | 16:12 | 0x0 | masking value for target time and frequency clock accuracy control. the value in this field determines the masked bits in the comparison of the features (0 = no masking; 5'd16 is the highest value allowed).
rsv | 31:17 | 0x0 | reserved.

8.17.15 target time register 0 low - trgttiml0 (0x0b644; rw)

field | bit(s) | initial value | description
ttl | 31:0 | 0x0 | target time 0 lsb register.

8.17.16 target time register 0 high - trgttimh0 (0x0b648; rw)

field | bit(s) | initial value | description
tth | 31:0 | 0x0 | target time 0 msb register.
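A minimal sketch of composing and decoding a tsauxc value from the fields listed in section 8.17.14 follows. The helper names are hypothetical. The table lists both en_clk1 and a reserved entry at bit 5, so this sketch treats bit 5 as en_clk1, which is an assumption.

```python
# Bit positions copied from the tsauxc table; bit 5 is assumed to be en_clk1.
TSAUXC_BITS = {
    "en_tt0": 0, "en_tt1": 1, "utt0": 3, "st0": 4, "en_clk1": 5,
    "utt1": 6, "st1": 7, "en_ts0": 8, "autt0": 9, "en_ts1": 10, "autt1": 11,
}
MASK_SHIFT = 12   # mask field occupies bits 16:12
MASK_MAX = 16     # highest mask value allowed by the table


def build_tsauxc(flags, mask=0):
    """Return a tsauxc word with the named single-bit flags set."""
    if not 0 <= mask <= MASK_MAX:
        raise ValueError("mask must be in the range 0..16")
    value = mask << MASK_SHIFT
    for name in flags:
        value |= 1 << TSAUXC_BITS[name]
    return value


def decode_tsauxc(value):
    """Return a dict of the documented tsauxc fields."""
    fields = {name: bool((value >> bit) & 1) for name, bit in TSAUXC_BITS.items()}
    fields["mask"] = (value >> MASK_SHIFT) & 0x1F
    return fields
```

Since hardware clears en_tt0 and en_tt1 when the target time is hit, software could decode a freshly read value to see whether a programmed target time has already fired.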
8.17.17 target time register 1 low - trgttiml1 (0x0b64c; rw)

field | bit(s) | initial value | description
ttl | 31:0 | 0x0 | target time 1 lsb register.

8.17.18 target time register 1 high - trgttimh1 (0x0b650; rw)

field | bit(s) | initial value | description
tth | 31:0 | 0x0 | target time 1 msb register.

8.17.19 auxiliary time stamp 0 register low - auxstmpl0 (0x0b65c; ro)

field | bit(s) | initial value | description
tstl | 31:0 | 0x0 | auxiliary time stamp 0 lsb value.

8.17.20 auxiliary time stamp 0 register high - auxstmph0 (0x0b660; ro)

reading this register will release the value stored in auxstmph/l0 and will allow stamping of the next value.

field | bit(s) | initial value | description
tsth | 31:0 | 0x0 | auxiliary time stamp 0 msb value.

8.17.21 auxiliary time stamp 1 register low - auxstmpl1 (0x0b664; ro)

field | bit(s) | initial value | description
tstl | 31:0 | 0x0 | auxiliary time stamp 1 lsb value.

8.17.22 auxiliary time stamp 1 register high - auxstmph1 (0x0b668; ro)

reading this register will release the value stored in auxstmph/l1 and will allow stamping of the next value.

field | bit(s) | initial value | description
tsth | 31:0 | 0x0 | auxiliary time stamp 1 msb value.
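Because reading the high register releases the latched auxiliary time stamp, a reader would plausibly fetch the low half first and the high half last. The sketch below shows that ordering. The `read32` callable is a hypothetical stand-in for whatever register-access routine the host software provides, and the function name is illustrative.

```python
# Register offsets taken from sections 8.17.19 to 8.17.22: (low, high).
AUX_REGS = {
    0: (0x0B65C, 0x0B660),
    1: (0x0B664, 0x0B668),
}


def read_aux_timestamp(read32, index):
    """Read auxiliary time stamp `index` (0 or 1) as one 64-bit value.

    read32(offset) is assumed to return the 32-bit register at `offset`.
    The low half is read first; the read of the high half comes last
    because it releases the latched value for the next stamp.
    """
    low_offset, high_offset = AUX_REGS[index]
    low = read32(low_offset) & 0xFFFFFFFF
    high = read32(high_offset) & 0xFFFFFFFF
    return (high << 32) | low
```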
8.17.23 time sync rx configuration - tsyncrxcfg (0x05f50; rw)

field | bit(s) | initial value | description
ctrlt | 7:0 | 0x0 | v1 control to timestamp.
msgt | 11:8 | 0x0 | v2 messageid to timestamp.
trnsspc | 15:12 | 0x0 | v2 transport specific value to timestamp.
reserved | 31:16 | 0x0 | reserved.

8.17.24 time sync sdp config reg - tssdp (0x0003c; rw)

this register defines the assignment of sdp pins to the time sync auxiliary capabilities.

field | bit(s) | initial value | description
aux0_sdp_sel | 1:0 | 00b | select one of the sdps to serve as the trigger for auxiliary time stamp 0 (aux0). 00b = sdp0 is assigned. 01b = sdp1 is assigned. 10b = sdp2 is assigned. 11b = sdp3 is assigned.
aux0_ts_sdp_en | 2 | 0b | when set, indicates that one of the sdps can be used as an external trigger to aux timestamp 0 (note that if this bit is set to one of the sdp pins, the corresponding pin should be configured to input mode using spd_dir).
aux1_sdp_sel | 4:3 | 00b | select one of the sdps to serve as the trigger for auxiliary time stamp 1 (aux1). 00b = sdp0 is assigned. 01b = sdp1 is assigned. 10b = sdp2 is assigned. 11b = sdp3 is assigned.
aux1_ts_sdp_en | 5 | 0b | when set, indicates that one of the sdps can be used as an external trigger to aux timestamp 1 (note that if this bit is set to one of the sdp pins, the corresponding pin should be configured to input mode using spd_dir).
ts_sdp0_sel | 7:6 | 00b | sdp0 allocation to tsync event. when ts_sdp0_en is set, these bits select the tsync event that is routed to sdp0. 00b = target time 0 is output on sdp0. 01b = target time 1 is output on sdp0. 10b - 11b = reserved.
ts_sdp0_en | 8 | 0b | when set, indicates that sdp0 is assigned to tsync.
ts_sdp1_sel | 10:9 | 00b | sdp1 allocation to tsync event. when ts_sdp1_en is set, these bits select the tsync event that is routed to sdp1. 00b = target time 0 is output on sdp1. 01b = target time 1 is output on sdp1. 10b - 11b = reserved.
ts_sdp1_en | 11 | 0b | when set, indicates that sdp1 is assigned to tsync.
ts_sdp2_sel | 13:12 | 00b | sdp2 allocation to tsync event. when ts_sdp2_en is set, these bits select the tsync event that is routed to sdp2. 00b = target time 0 is output on sdp2. 01b = target time 1 is output on sdp2. 10b - 11b = reserved.
ts_sdp2_en | 14 | 0b | when set, indicates that sdp2 is assigned to tsync.
ts_sdp3_sel | 16:15 | 00b | sdp3 allocation to tsync event. when ts_sdp3_en is set, these bits select the tsync event that is routed to sdp3. 00b = target time 0 is output on sdp3. 01b = target time 1 is output on sdp3. 10b - 11b = reserved.
ts_sdp3_en | 17 | 0b | when set, indicates that sdp3 is assigned to tsync.
reserved | 31:18 | 0x0 | reserved.

8.18 pcs register descriptions

the usage of these registers is described in section 3.5.4.1 and section 3.5.4.2.

8.18.1 pcs configuration - pcs_cfg (0x04200; r/w)

field | bit(s) | initial value | description
reserved | 2:0 | 000b | reserved.
pcs enable | 3 | 1b | pcs enable. enables the pcs logic of the mac. should be set in both sgmii and serdes mode for normal operation. clearing this bit disables rx/tx of both data and control codes. use this to force link down at the far end.
reserved | 29:4 | 0x0 | reserved.
pcs isolate | 30 | 0b | pcs isolate. setting this bit isolates the pcs logic from the mac's data path. pcs control codes are still sent and received.
sreset | 31 | 0b | soft reset. setting this bit puts all modules within the mac in reset except the host interface. the host interface is reset via hrst. this bit is not self clearing; gmac is in a reset state until this bit is cleared.
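Since sreset in pcs_cfg is not self clearing, software has to write the bit twice: once to assert it and once to release it. The sketch below computes the two values to write, plus the value that clears pcs enable to force the far end link down. The function names are hypothetical, and no timing between the two writes is implied because the text above does not give one.

```python
PCS_ENABLE = 1 << 3     # pcs enable, bit 3
PCS_ISOLATE = 1 << 30   # pcs isolate, bit 30
SRESET = 1 << 31        # soft reset, bit 31 (not self clearing)


def pcs_soft_reset_sequence(current):
    """Return (asserted, released): the two pcs_cfg values to write in order."""
    asserted = (current | SRESET) & 0xFFFFFFFF
    released = asserted & ~SRESET & 0xFFFFFFFF
    return asserted, released


def force_far_end_link_down(current):
    """Return pcs_cfg with pcs enable cleared, which stops rx/tx of data and control codes."""
    return current & ~PCS_ENABLE & 0xFFFFFFFF
```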
8.18.2 pcs link control - pcs_lctl (0x04208; rw)

field | bit(s) | initial value | description
reserved | 0 | 0b | reserved.
fsv | 2:1 | 10b | forced speed value. these bits denote the speed when force speed and duplex is set. this value is also used when auto-negotiation (an) is disabled or when in serdes mode. 00b = 10 mb/s (sgmii). 01b = 100 mb/s (sgmii). 10b = 1000 mb/s (serdes/sgmii). 11b = reserved.
fdv | 3 | 1b | forced duplex value. this bit denotes the duplex mode when force speed and duplex is set. this value is also used when an is disabled or when in serdes mode. 1b = full duplex (serdes/sgmii). 0b = half duplex (sgmii).
fsd | 4 | 0b | force speed and duplex. if this bit is set, then speed and duplex mode are forced to forced speed value and forced duplex value, respectively. otherwise, speed and duplex mode are decided by the internal an/sync state machines.
reserved | 5 | 0b | reserved - must be set to zero.
link latch low | 6 | 0b | link latch low enable. if this bit is set, then link ok going low (negative edge) is latched until a processor read. afterwards, link ok is continuously updated until link ok again goes low (negative edge is seen).
force flow control | 7 | 0b | 0b = flow control mode is set according to the an process by following table 37-4 in the ieee 802.3 spec. 1b = flow control is set according to the fc_tx_en / fc_rx_en bits in the ctrl register.
reserved | 15:8 | - | reserved.
an_enable | 16 | 0b (note 1) | an enable. setting this bit enables the an process.
an restart | 17 | 0b | an restart. setting this bit restarts the an process. this bit is self clearing.
an timeout en | 18 | 1b | an timeout enable. this bit enables the an timeout feature. during an, if the link partner does not respond with an pages but continues to send good idle symbols, then link up is assumed. (this enables a link up condition when the link partner is not an-capable and has no effect otherwise.) this bit should not be set in sgmii mode.
an sgmii bypass | 19 | 0b | an sgmii bypass. if this bit is set, then the idle detect state is bypassed during an in sgmii mode. this reduces the acknowledge time in sgmii mode.
an sgmii trigger | 20 | 1b | an sgmii trigger. if this bit is cleared, then an is not automatically triggered in sgmii mode even if sync fails. an is triggered only in response to phy messages or by a manual setting such as changing the an enable/restart bits.
reserved | 23:21 | 000b | reserved.
fast link timer | 24 | 0b | fast link timer. the an timer is reduced if this bit is set.
link ok fix en | 25 | 1b | link ok fix enable. control for enabling/disabling the linkok/syncok fix. should be set for normal operation.
reserved | 26 | 0b | reserved.
reserved | 31:27 | 0x0 | reserved.

note 1: read from eeprom word 0x0f, bit 11.

8.18.3 pcs link status - pcs_lsts (0x0420c; ro)

field | bit(s) | initial value | description
link ok | 0 | 0b | link ok. this bit denotes the current link ok status. 0b = link down. 1b = link up/ok.
speed | 2:1 | 10b | speed. these bits denote the current operating speed. 00b = 10 mb/s. 01b = 100 mb/s. 10b = 1000 mb/s. 11b = reserved.
duplex | 3 | 1b | duplex. this bit denotes the current duplex mode. 1b = full duplex. 0b = half duplex.
sync ok | 4 | 0b | sync ok. this bit indicates the current value of sync ok from the pcs sync state machine.
reserved | 15:5 | - | reserved.
an complete | 16 | 0b | an complete. this bit indicates that the an process has completed. this bit is set when the an process reaches the link ok state. it is reset upon an restart or reset. it is set even if the an negotiation failed and no common capabilities were found.
an page received | 17 | 0b | an page received. this bit indicates that a link partner's page was received during an an process. this bit is cleared on reads.
an timedout | 18 | 0b | an timed out. this bit indicates that an an process timed out. valid after the an complete bit is set.
an remote fault | 19 | 0b | an remote fault. this bit indicates that an an page was received with a remote fault indication during an an process. this bit is cleared on reads.
an error (rws) | 20 | 0b | an error. this bit indicates that an an error condition was detected in serdes/sgmii mode. valid after the an complete bit is set. an error conditions:
    - serdes mode: both nodes not full duplex.
    - sgmii mode: phy is set to 1000 mb/s half duplex mode.
    - software can also force an an error condition by writing to this bit (or can clear an existing an error condition).
    - this bit is cleared at the start of an.
reserved | 31:21 | 0x0 | reserved.

8.18.4 an advertisement - pcs_anadv (0x04218; r/w)

field | bit(s) | initial value | description
reserved | 4:0 | - | reserved.
fdcap | 5 | 1b | full duplex. setting this bit indicates that the 82576 is capable of full duplex operation. this bit should be set to 1b for normal operation.
hdcap (ro) | 6 | 0b | half duplex. this bit indicates that the 82576 is capable of half duplex operation. this bit is tied to 0b because the 82576 does not support half duplex in serdes mode.
asm | 8:7 | 0b (note 1) | local pause capabilities. the 82576's pause capability is encoded in this field. 00b = no pause. 01b = symmetric pause. 10b = asymmetric pause to link partner. 11b = both symmetric and asymmetric pause to the 82576.
reserved | 11:9 | - | reserved.
rflt | 13:12 | 00b | remote fault. the 82576's remote fault condition is encoded in this field. the 82576 might indicate a fault by setting a non-zero remote fault encoding and re-negotiating. 00b = no error, link ok. 01b = link failure. 10b = offline. 11b = auto-negotiation error.
reserved | 14 | - | reserved.
nextp | 15 | 0b | next page capable. the 82576 asserts this bit to request a next page transmission. the 82576 clears this bit when no subsequent next pages are requested.
reserved | 31:16 | 0x0 | reserved.

note 1: loaded from eeprom word 0x0f, bits 13:12.

8.18.5 link partner ability - pcs_lpab (0x0421c; ro)

field | bit(s) | initial value | description
reserved | 4:0 | - | reserved.
lpfd | 5 | 0b | lp full duplex (serdes). when set to 1b, the link partner is capable of full duplex operation. when set to 0b, the link partner is not capable of full duplex mode. this bit is reserved while in sgmii mode.
lphd | 6 | 0b | lp half duplex (serdes). when set to 1b, the link partner is capable of half duplex operation. when set to 0b, the link partner is not capable of half duplex mode. this bit is reserved while in sgmii mode.
lpasm | 8:7 | 00b | lp asmdr/lp pause (serdes). the link partner's pause capability is encoded in this field. 00b = no pause. 01b = symmetric pause. 10b = asymmetric pause to link partner. 11b = both symmetric and asymmetric pause to the 82576. these bits are reserved while in sgmii mode.
reserved | 9 | - | reserved.
sgmii speed | 11:10 | 00b | serdes: reserved. speed (sgmii): speed indication from the phy.
prf | 13:12 | 00b | lp remote fault (serdes). the link partner's remote fault condition is encoded in this field. 00b = no error, link ok. 10b = link failure. 01b = offline. 11b = auto-negotiation error. sgmii[13]: reserved. sgmii[12]: duplex mode indication from the phy.
ack | 14 | 0b | acknowledge (serdes). the link partner has acknowledged page reception. sgmii: reserved.
lpnextp | 15 | 0b | lp next page capable (serdes). the link partner asserts this bit to indicate its ability to accept next pages. sgmii: link-ok indication from the phy.
reserved | 31:16 | - | reserved.

8.18.6 next page transmit - pcs_nptx (0x04220; rw)

field | bit(s) | initial value | description
code | 10:0 | 0x0 | message/unformatted code field. the message field is an 11-bit wide field that encodes 2048 possible messages. the unformatted code field is an 11-bit wide field that might contain an arbitrary value.
toggle | 11 | 0b | toggle. this bit is used to ensure synchronization with the link partner during next page exchange. this bit always takes the opposite value of the toggle bit in the previously exchanged link code word. the initial value of the toggle bit in the first next page transmitted is the inverse of bit 11 in the base link code word and, therefore, can assume a value of 0b or 1b. the toggle bit is set as follows: 0b = previous value of the transmitted link code word was 1b. 1b = previous value of the transmitted link code word was 0b.
ack2 | 12 | 0b | acknowledge 2. used to indicate that a device has successfully received its link partner's link code word.
pgtype | 13 | 0b | message/unformatted page. this bit is used to differentiate a message page from an unformatted page. the encoding is: 0b = unformatted page. 1b = message page.
reserved | 14 | - | reserved.
nxtpg | 15 | 0b | next page. used to indicate whether or not this is the last next page to be transmitted. the encoding is: 0b = last page. 1b = additional next pages follow.
reserved | 31:16 | - | reserved.

8.18.7 link partner ability next page - pcs_lpabnp (0x04224; ro)

field | bit(s) | initial value | description
code | 10:0 | - | message/unformatted code field. the message field is an 11-bit wide field that encodes 2048 possible messages. the unformatted code field is an 11-bit wide field that might contain an arbitrary value.
toggle | 11 | - | toggle. this bit is used to ensure synchronization with the link partner during next page exchange. this bit always takes the opposite value of the toggle bit in the previously exchanged link code word. the initial value of the toggle bit in the first next page transmitted is the inverse of bit 11 in the base link code word and, therefore, can assume a value of 0b or 1b. the toggle bit is set as follows: 0b = previous value of the transmitted link code word was 1b. 1b = previous value of the transmitted link code word was 0b.
ack2 | 12 | - | acknowledge 2. used to indicate that a device has successfully received its link partner's link code word.
msgpg | 13 | - | message page. this bit is used to differentiate a message page from an unformatted page. the encoding is: 0b = unformatted page. 1b = message page.
ack | 14 | - | acknowledge. the link partner has acknowledged next page reception.
nxtpg | 15 | - | next page. used to indicate whether or not this is the last next page to be transmitted. the encoding is: 0b = last page. 1b = additional next pages follow.
reserved | 31:16 | - | reserved.
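The next page layout shared by pcs_nptx and pcs_lpabnp can be unpacked with a few shifts and masks. The sketch below decodes a raw pcs_lpabnp value and applies the toggle rule (each page carries the inverse of the previous toggle bit). The function names and dictionary keys are hypothetical.

```python
def decode_next_page(value):
    """Decode a raw pcs_lpabnp word into its documented fields."""
    return {
        "code": value & 0x7FF,                    # bits 10:0, message/unformatted code
        "toggle": (value >> 11) & 1,              # bit 11
        "ack2": (value >> 12) & 1,                # bit 12
        "message_page": bool((value >> 13) & 1),  # bit 13: 1b = message page
        "ack": (value >> 14) & 1,                 # bit 14
        "more_pages": bool((value >> 15) & 1),    # bit 15: 1b = additional pages follow
    }


def next_toggle(previous_toggle):
    """Return the toggle bit for the next page: the inverse of the previous one."""
    return previous_toggle ^ 1
```

A next page exchange loop would typically stop once `more_pages` is false on both sides, but that control flow is outside what this register description covers.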
8.18.8 sfp i2c command - i2ccmd (0x01028; r/w)

this register is used by software to read or write to the configuration registers in an sfp module.

note: according to the sfp specification, only reads are allowed from this interface; however, sfp vendors also provide a writable register through this interface (for example, phy registers). as a result, write capability is also supported.

field | bit(s) | initial value | description
data | 15:0 | x | data. in a write command, software places the data bits and then the mac shifts them out to the i2c bus. in a read command, the mac reads these bits serially from the i2c bus and then software reads them from this location. note: this field is read in byte order, not in word order.
regadd | 23:16 | 0x0 | i2c register address. for example, register 0, 1, 2, . . . 255.
phyadd | 26:24 | 0x0 | device address bits 3-1. the actual address used is b{1010, phyadd[2:0], 0}.
op | 27 | 0b | op code. 0b = i2c write. 1b = i2c read.
reset | 28 | 0b | reset sequence. if set, sends a reset sequence before the actual read or write. this bit is self clearing. a reset sequence is defined as nine consecutive stop conditions.
r | 29 | 0b | ready bit. set to 1b by the 82576 at the end of the i2c transaction. for example, indicates a read or write has completed. reset by a software write of a command.
reserved | 30 | 0b | reserved.
e | 31 | 0b | error. this bit is set to 1b by hardware when it fails to complete an i2c read. reset by a software write of a command.

8.18.9 sfp i2c parameters - i2cparams (0x0102c; r/w)

this register is used to set the parameters for the i2c access to the sfp module and to allow bit bang access to the i2c interface.

field | bit(s) | initial value | description
write time | 4:0 | 110b | write time. defines the delay between a write access and the next access. the value is in milliseconds. a value of zero is not valid.
read time | 7:5 | 010b | read time. defines the delay between a read access and the next access. the value is in microseconds. a value of zero is not valid.
i2cbb_en | 8 | 0b | i2c bit bang enable. if set, the i2c_clk and i2c_data lines are controlled via the clk, data and data_oe_n fields of this register. otherwise, they are controlled by the hardware machine activated via the i2ccmd or mdic registers.
clk | 9 | 0b | i2c clock. while in bit bang mode, controls the value driven on the i2c_clk pad of this port.
data_out | 10 | 0b | i2c_data. while in bit bang mode and when the data_oe_n field is zero, controls the value driven on the i2c_data pad of this port.
data_oe_n | 11 | 0b | i2c_data_oe_n. while in bit bang mode, controls the direction of the i2c_data pad of this port. 0b = pad is output. 1b = pad is input.
data_in (ro) | 12 | x | i2c_data_in. reflects the value of the i2c_data pad. while in bit bang mode and when the data_oe_n field is zero, this field reflects the value set in the data_out field.
reserved | 31:13 | 0x0 | reserved.

8.19 statistics register descriptions

all statistics registers reset when read. in addition, they stick at 0xffff_ffff when the maximum value is reached.

for the receive statistics it should be noted that a packet is indicated as received if it passes the 82576's filters and is placed into the packet buffer memory. a packet does not have to be transferred to host memory in order to be counted as received.

due to divergent paths between interrupt generation and logging of relevant statistics counts, it might be possible to generate an interrupt to the system for a noteworthy event before the associated statistics count is actually incremented. this is extremely unlikely due to expected delays associated with the system interrupt collection and isr delay, but might be observed as an interrupt for which statistics values do not quite make sense. hardware guarantees that any event noteworthy of inclusion in a statistics count is reflected in the appropriate count within 1 µs; a small time delay prior to a read of statistics might be necessary to avoid the potential for receiving an interrupt and observing an inconsistent statistics count as part of the isr.

8.19.1 crc error count - crcerrs (0x04000; rc)

counts the number of receive packets with crc errors. in order for a packet to be counted in this register, it must pass address filtering and must be 64 bytes or greater (from <destination address> through <crc>, inclusively) in length. if receives are not enabled, then this register does not increment.

field | bit(s) | initial value | description
cec | 31:0 | 0x0 | crc error count.
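Because every statistics register resets when read and sticks at 0xffff_ffff, host software usually has to keep its own running totals. The sketch below shows one way to do that for a single counter such as crcerrs. The class name is hypothetical, and the raw value is assumed to come from a register read performed elsewhere.

```python
SATURATED = 0xFFFFFFFF  # value at which a statistics register sticks


class StatAccumulator:
    """Running software total for one clear-on-read statistics register."""

    def __init__(self):
        self.total = 0
        self.saturated = False  # True once a read returned the stuck value

    def update(self, raw):
        """Add one register read (which clears the hardware count) to the total."""
        raw &= 0xFFFFFFFF
        if raw == SATURATED:
            # The hardware count may have stuck, so events could be missing.
            self.saturated = True
        self.total += raw
        return self.total
```

Python integers do not overflow, so `total` can exceed 32 bits; a read that returns the stuck value only marks the total as a possible undercount.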
8.19.2 alignment error count - algnerrc (0x04004; rc)

counts the number of receive packets with alignment errors (the packet is not an integer number of bytes in length). in order for a packet to be counted in this register, it must pass address filtering and must be 64 bytes or greater (from <destination address> through <crc>, inclusive) in length. if receives are not enabled, then this register does not increment. this register is valid only in mii mode during 10/100 mb/s operation.

field | bit(s) | initial value | description
aec | 31:0 | 0x0 | alignment error count.

8.19.3 symbol error count - symerrs (0x04008; rc)

counts the number of symbol errors between reads. the count increases for every bad symbol received, whether or not a packet is currently being received and whether or not the link is up. when working in serdes/sgmii mode these statistics can be read from the scvpc register.

field | bit(s) | initial value | description
symerrs | 31:0 | 0x0 | symbol error count.

8.19.4 rx error count - rxerrc (0x0400c; rc)

counts the number of packets received in which rx_er was asserted by the phy. in order for a packet to be counted in this register, it must pass address filtering and must be 64 bytes or greater (from <destination address> through <crc>, inclusive) in length. if receives are not enabled, then this register does not increment. this register is not available in serdes/sgmii modes.

field | bit(s) | initial value | description
rxec | 31:0 | 0x0 | rx error count.

8.19.5 missed packets count - mpc (0x04010; rc)

counts the number of missed packets. packets are missed when the receive fifo has insufficient space to store the incoming packet. this can be caused by too few buffers being allocated, or by insufficient bandwidth on the pci bus. events setting this counter cause rxo, the receiver overrun interrupt, to be set. this register does not increment if receives are not enabled. these packets are also counted in the total packets received register as well as in total octets received.

field | bit(s) | initial value | description
mpc | 31:0 | 0x0 | missed packets count.

table 8-21. single collision count - scc (0x04014; rc)

this register counts the number of times that a successfully transmitted packet encountered a single collision. this register only increments if transmits are enabled and the 82576 is in half-duplex mode.
field | bit(s) | initial value | description
scc | 31:0 | 0x0 | number of times a transmit encountered a single collision.

8.19.6 excessive collisions count - ecol (0x04018; rc)

when 16 or more collisions have occurred on a packet, this register increments, regardless of the value of collision threshold. if collision threshold is set below 16, this counter won't increment. this register only increments if transmits are enabled and the 82576 is in half-duplex mode.

field | bit(s) | initial value | description
ecc | 31:0 | 0x0 | number of packets with more than 16 collisions.

8.19.7 multiple collision count - mcc (0x0401c; rc)

this register counts the number of times that a transmit encountered more than one collision but less than 16. this register only increments if transmits are enabled and the 82576 is in half-duplex mode.

field | bit(s) | initial value | description
mcc | 31:0 | 0x0 | number of times a successful transmit encountered multiple collisions.

8.19.8 late collisions count - latecol (0x04020; rc)

late collisions are collisions that occur after one slot time. this register only increments if transmits are enabled and the 82576 is in half-duplex mode.

field | bit(s) | initial value | description
lcc | 31:0 | 0x0 | number of packets with late collisions.

8.19.9 collision count - colc (0x04028; rc)

this register counts the total number of collisions seen by the transmitter. this register only increments if transmits are enabled and the 82576 is in half-duplex mode. this register applies to clear as well as secure traffic.

field | bit(s) | initial value | description
ccc | 31:0 | 0x0 | total number of collisions experienced by the transmitter.

8.19.10 defer count - dc (0x04030; rc)

this register counts defer events. a defer event occurs when the transmitter cannot immediately send a packet because the medium is busy: another device is transmitting, the ipg timer has not expired, half-duplex deferral events, reception of xoff frames, or the link is not up. this register only increments if transmits are enabled. this counter does not increment for streaming transmits that are deferred due to tx ipg.
field | bit(s) | initial value | description
cdc | 31:0 | 0x0 | number of defer events.

8.19.11 transmit with no crs - tncrs (0x04034; rc)

this register counts the number of successful packet transmissions in which the crs input from the phy was not asserted within one slot time of start of transmission from the mac. start of transmission is defined as the assertion of tx_en to the phy. the phy should assert crs during every transmission. failure to do so might indicate that the link has failed, or that the phy has an incorrect link configuration. this register only increments if transmits are enabled. this register is not valid in sgmii mode and is only valid when the 82576 is operating at half duplex.

field | bit(s) | initial value | description
tncrs | 31:0 | 0x0 | number of transmissions without a crs assertion from the phy.

8.19.12 host transmit discarded packets by mac count - htdpmc (0x0403c; rc)

this register counts the number of packets sent by the host (and not the manageability engine) that are dropped by the mac. this can include packets dropped because of excessive collisions or link fail events.

field | bit(s) | initial value | description
htdpmc | 31:0 | 0x0 | number of packets sent by the host but discarded by the mac.

8.19.13 receive length error count - rlec (0x04040; rc)

this register counts receive length error events. a length error occurs if an incoming packet passes the filter criteria but is undersized or oversized. packets less than 64 bytes are undersized. packets over 1518/1522/1526 bytes (according to the number of vlan tags present) are oversized if long packet enable (lpe) is 0b. if lpe is 1b, then an incoming packet is considered oversized if it exceeds the size defined in the rlpml.rlpml field. if receives are not enabled, this register does not increment. these lengths are based on bytes in the received packet from <destination address> through <crc>, inclusive.

note: runt packets smaller than 25 bytes may not be counted by this counter.

field | bit(s) | initial value | description
rlec | 31:0 | 0x0 | number of packets with receive length errors.
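The rlec length rules can be written down as a small predicate. This sketch only mirrors the size thresholds stated above. The function and parameter names are hypothetical, it takes the packet as having already passed the filter criteria, and it does not model the note that runts smaller than 25 bytes may go uncounted.

```python
# Oversize limits for 0, 1 and 2 vlan tags when lpe is 0b.
OVERSIZE_LIMITS = (1518, 1522, 1526)


def is_length_error(length, vlan_tags=0, lpe=False, rlpml=None):
    """Return True if a packet of `length` bytes is undersized or oversized.

    length    : bytes from destination address through crc, inclusive
    vlan_tags : number of vlan tags present (0, 1 or 2)
    lpe       : long packet enable
    rlpml     : maximum length from rlpml.rlpml, required when lpe is set
    """
    if length < 64:
        return True  # undersized
    if lpe:
        if rlpml is None:
            raise ValueError("rlpml is required when lpe is set")
        return length > rlpml
    if not 0 <= vlan_tags < len(OVERSIZE_LIMITS):
        raise ValueError("vlan_tags must be 0, 1 or 2")
    return length > OVERSIZE_LIMITS[vlan_tags]
```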
8.19.14 circuit breaker rx dropped packet - cbrdpc (0x04044; rc)

this register counts the number of circuit breaker rx dropped packets. this counter counts only packets that passed the layer 2 filtering and were sent to the host by the rx filter, but were dropped according to the circuit breaker decision.

field | bit(s) | initial value | description
cbrdpc | 31:0 | 0 | circuit breaker rx dropped packet.

8.19.15 xon received count - xonrxc (0x04048; rc)

this register counts the number of valid xon packets received. xon packets can use the global address or the station address. this register only increments if receives are enabled.

field | bit(s) | initial value | description
xonrxc | 31:0 | 0x0 | number of xon packets received.

8.19.16 xon transmitted count - xontxc (0x0404c; rc)

this register counts the number of xon packets transmitted. these can be either due to a full queue or due to software initiated action (using tctl.swxoff). this register only increments if transmits are enabled.

field | bit(s) | initial value | description
xontxc | 31:0 | 0x0 | number of xon packets transmitted.

8.19.17 xoff received count - xoffrxc (0x04050; rc)

this register counts the number of valid xoff packets received. xoff packets can use the global address or the station address. this register only increments if receives are enabled.

field | bit(s) | initial value | description
xoffrxc | 31:0 | 0x0 | number of xoff packets received.

8.19.18 xoff transmitted count - xofftxc (0x04054; rc)

this register counts the number of xoff packets transmitted. these can be either due to a full queue or due to software initiated action (using tctl.swxoff). this register only increments if transmits are enabled.

field | bit(s) | initial value | description
xofftxc | 31:0 | 0x0 | number of xoff packets transmitted.
8.19.19 fc received unsupported count - fcruc (0x04058; rc)

this register counts the number of unsupported flow control frames that are received. the fcruc counter increments when a flow control packet is received that matches either the reserved flow control multicast address (in fcah/l) or the mac station address, and has a matching flow control type field (to the value in fct), but has an incorrect opcode field. this register only increments if receives are enabled.

field | bit(s) | initial value | description
fcruc | 31:0 | 0x0 | number of unsupported flow control frames received.

8.19.20 packets received [64 bytes] count - prc64 (0x0405c; rc)

this register counts the number of good packets received that are exactly 64 bytes (from <destination address> through <crc>, inclusive) in length. packets that are counted in the missed packet count register are not counted in this register. packets sent to the manageability engine are included in this counter. this register does not include received flow control packets and increments only if receives are enabled.

field | bit(s) | initial value | description
prc64 | 31:0 | 0x0 | number of packets received that are 64 bytes in length.

8.19.21 packets received [65-127 bytes] count - prc127 (0x04060; rc)

this register counts the number of good packets received that are 65-127 bytes (from <destination address> through <crc>, inclusive) in length. packets that are counted in the missed packet count register are not counted in this register. packets sent to the manageability engine are included in this counter. this register does not include received flow control packets and increments only if receives are enabled.

field | bit(s) | initial value | description
prc127 | 31:0 | 0x0 | number of packets received that are 65-127 bytes in length.

8.19.22 packets received [128-255 bytes] count - prc255 (0x04064; rc)

this register counts the number of good packets received that are 128-255 bytes (from <destination address> through <crc>, inclusive) in length. packets that are counted in the missed packet count register are not counted in this register. packets sent to the manageability engine are included in this counter. this register does not include received flow control packets and increments only if receives are enabled.

field | bit(s) | initial value | description
prc255 | 31:0 | 0x0 | number of packets received that are 128-255 bytes in length.
8.19.23 packets received [256-511 bytes] count - prc511 (0x04068; rc)

this register counts the number of good packets received that are 256-511 bytes (from <destination address> through <crc>, inclusive) in length. packets that are counted in the missed packet count register are not counted in this register. packets sent to the manageability engine are included in this counter. this register does not include received flow control packets and increments only if receives are enabled.

field     bit(s)  initial value  description
prc511    31:0    0x0            number of packets received that are 256-511 bytes in length.

8.19.24 packets received [512-1023 bytes] count - prc1023 (0x0406c; rc)

this register counts the number of good packets received that are 512-1023 bytes (from <destination address> through <crc>, inclusive) in length. packets that are counted in the missed packet count register are not counted in this register. packets sent to the manageability engine are included in this counter. this register does not include received flow control packets and increments only if receives are enabled.

field     bit(s)  initial value  description
prc1023   31:0    0x0            number of packets received that are 512-1023 bytes in length.

8.19.25 packets received [1024 to max bytes] count - prc1522 (0x04070; rc)

this register counts the number of good packets received that are from 1024 bytes to the maximum (from <destination address> through <crc>, inclusive) in length. the maximum depends on the current receiver configuration (for example, lpe) and the type of packet being received. if a packet is counted in receive oversize count, it is not counted in this register (see section 8.19.37). this register does not include received flow control packets and only increments if the packet has passed address filtering and receives are enabled. packets sent to the manageability engine are included in this counter.

due to changes in the standard for maximum frame size for vlan tagged frames in 802.3, the 82576 accepts packets that have a maximum length of 1522 bytes. the rmon statistics associated with this range have been extended to count 1522 byte long packets. if ctrl_ext.extended_vlan is set, packets up to 1526 bytes are counted by this counter.

field     bit(s)  initial value  description
prc1522   31:0    0x0            number of packets received that are 1024-max bytes in length.
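the six receive size counters (prc64 through prc1522) partition good received packets by length. as a rough illustration, the python sketch below maps a frame length to the counter that would be expected to increment. rx_size_bucket and its max_length parameter are hypothetical names for this example; the default of 1522 comes from the text above, and the real maximum depends on the receiver configuration. the sketch assumes the packet is good, passed filtering, is not a flow control packet, and was not counted as missed.

```python
from typing import Optional


def rx_size_bucket(length: int, max_length: int = 1522) -> Optional[str]:
    """Return the size counter expected to count a good received packet.

    length is the frame length in bytes as the datasheet measures it.
    max_length is an assumption supplied by the caller (1522 by default,
    1526 when extended vlan is enabled, or a long-packet limit).
    Returns None for lengths outside the counted range.
    """
    if length < 64 or length > max_length:
        return None  # undersize or oversize: handled by other counters
    if length == 64:
        return "prc64"
    if length <= 127:
        return "prc127"
    if length <= 255:
        return "prc255"
    if length <= 511:
        return "prc511"
    if length <= 1023:
        return "prc1023"
    return "prc1522"  # 1024 bytes up to the configured maximum
```

for example, rx_size_bucket(128) selects prc255, while rx_size_bucket(1526) returns None unless the caller passes max_length=1526.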
8.19.26 good packets received count - gprc (0x04074; rc)

this register counts the number of good packets received of any legal length. the legal length for the received packet is defined by the value of long packet enable (rctl.lpe) (see section 8.19.37). this register does not include received flow control packets and only counts packets that pass filtering. this register only increments if receives are enabled. this register does not count packets counted by the missed packet count (mpc) register. packets sent to the manageability engine are included in this counter.

note: gprc can count packets interrupted by a link disconnect even though they have a crc error.

field     bit(s)  initial value  description
gprc      31:0    0x0            number of good packets received (of any length).

8.19.27 broadcast packets received count - bprc (0x04078; rc)

this register counts the number of good (no errors) broadcast packets received. this register does not count broadcast packets received when the broadcast address filter is disabled. this register only increments if receives are enabled. this register does not count packets counted by the missed packet count (mpc) register. packets sent to the manageability engine are included in this counter.

field     bit(s)  initial value  description
bprc      31:0    0x0            number of broadcast packets received.

8.19.28 multicast packets received count - mprc (0x0407c; rc)

this register counts the number of good (no errors) multicast packets received. this register does not count multicast packets received that fail to pass address filtering, nor does it count received flow control packets. this register only increments if receives are enabled. this register does not count packets counted by the missed packet count (mpc) register. packets sent to the manageability engine are included in this counter.

field     bit(s)  initial value  description
mprc      31:0    0x0            number of multicast packets received.

8.19.29 good packets transmitted count - gptc (0x04080; rc)

this register counts the number of good (no errors) packets transmitted. a good transmit packet is one that is 64 or more bytes (from <destination address> through <crc>, inclusive) in length. this does not include transmitted flow control packets. this register only increments if transmits are enabled. the register counts clear as well as secure packets.

field     bit(s)  initial value  description
gptc      31:0    0x0            number of good packets transmitted.
8.19.30 good octets received count - gorcl (0x04088; rc)

gorcl and gorch make up a 64-bit register that counts the number of good (no errors) octets received. this register includes bytes received in a packet from the <destination address> field through the <crc> field, inclusive. gorcl must be read before gorch. in addition, the register sticks at 0xffff_ffff_ffff_ffff when the maximum value is reached. only octets of packets that pass address filtering are counted in this register. this register does not count octets of packets counted by the missed packet count (mpc) register. octets of packets sent to the manageability engine are included in this counter. this register only increments if receives are enabled. these octets do not include octets of received flow control packets.

field     bit(s)  initial value  description
gorcl     31:0    0x0            number of good octets received - lower 4 bytes.

8.19.31 good octets received count - gorch (0x0408c; rc)

field     bit(s)  initial value  description
gorch     31:0    0x0            number of good octets received - upper 4 bytes.

8.19.32 good octets transmitted count - gotcl (0x04090; rc)

gotcl and gotch make up a 64-bit register that counts the number of good (no errors) octets transmitted. this register must be accessed using two independent 32-bit accesses; gotcl must be read before gotch. in addition, the register sticks at 0xffff_ffff_ffff_ffff when the maximum value is reached. this register includes bytes transmitted in a packet from the <destination address> field through the <crc> field, inclusive. this register counts octets in successfully transmitted packets that are 64 or more bytes in length. this register only increments if transmits are enabled. the register counts clear as well as secure octets. these octets do not include octets in transmitted flow control packets.

field     bit(s)  initial value  description
gotcl     31:0    0x0            number of good octets transmitted - lower 4 bytes.

8.19.33 good octets transmitted count - gotch (0x04094; rc)

field     bit(s)  initial value  description
gotch     31:0    0x0            number of good octets transmitted - upper 4 bytes.
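the good octet counters are split into low and high 32-bit halves, with the low half read first and the combined value sticking at 0xffff_ffff_ffff_ffff. the python sketch below shows one way software might assemble the 64-bit value under those rules. read_counter64, is_saturated, and the read32 callable are hypothetical names used for illustration, not an existing driver interface.

```python
from typing import Callable

# Register offsets copied from the section headings above.
GORCL, GORCH = 0x04088, 0x0408C
GOTCL, GOTCH = 0x04090, 0x04094

# Value at which the 64-bit counters stick.
SATURATED = 0xFFFF_FFFF_FFFF_FFFF


def read_counter64(read32: Callable[[int], int],
                   low_offset: int, high_offset: int) -> int:
    """Combine two 32-bit reads into one 64-bit count.

    The low half is read before the high half, matching the ordering the
    text requires (for example, gorcl before gorch).
    """
    low = read32(low_offset) & 0xFFFFFFFF
    high = read32(high_offset) & 0xFFFFFFFF
    return (high << 32) | low


def is_saturated(value: int) -> bool:
    """True when the counter has hit its maximum and stopped counting."""
    return value == SATURATED
```

a caller would typically check is_saturated on the result and treat a saturated reading as a lower bound rather than an exact count.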
8.19.34 receive no buffers count - rnbc (0x040a0; rc)

this register counts the number of times that frames were received when there were no available buffers in host memory to store those frames (receive descriptor head and tail pointers were equal). the packet is still received if there is space in the fifo. this register only increments if receives are enabled.

note: this register does not increment when flow control packets are received.

field     bit(s)  initial value  description
rnbc      31:0    0x0            number of receive no buffer conditions.

8.19.35 receive undersize count - ruc (0x040a4; rc)

this register counts the number of received frames that passed address filtering, were less than minimum size (64 bytes from <destination address> through <crc>, inclusive), and had a valid crc. this register only increments if receives are enabled.

field     bit(s)  initial value  description
ruc       31:0    0x0            number of receive undersize errors.

8.19.36 receive fragment count - rfc (0x040a8; rc)

this register counts the number of received frames that passed address filtering, were less than minimum size (64 bytes from <destination address> through <crc>, inclusive), but had a bad crc (this is slightly different from the receive undersize count register). this register only increments if receives are enabled.

note: runt packets smaller than 25 bytes may not be counted by this counter.

field     bit(s)  initial value  description
rfc       31:0    0x0            number of receive fragment errors.

8.19.37 receive oversize count - roc (0x040ac; rc)

this register counts the number of received frames with a valid crc field that passed address filtering and were greater than maximum size. packets over 1522 bytes are oversized if longpacketenable (rctl.lpe) is 0b. if longpacketenable is 1b, then an incoming packet is considered oversized if it exceeds the value set in the rlpml register. in next generation vmdq mode, a packet is counted only if it is bigger than the vmolr.rlpml value for all the vfs that were supposed to receive the packet. if receives are not enabled, this register does not increment. these lengths are based on bytes in the received packet from <destination address> through <crc>, inclusive.

note: the maximum size of a packet when lpe is 0b is fixed according to the ctrl_ext.extended_vlan bit and the detection of a vlan tag in the packet.
field     bit(s)  initial value  description
roc       31:0    0x0            number of receive oversize errors.

8.19.38 receive jabber count - rjc (0x040b0; rc)

this register counts the number of received frames that passed address filtering, were greater than maximum size, and had a bad crc (this is slightly different from the receive oversize count register). packets over 1518/1522/1526 bytes are oversized if lpe is 0b. if lpe is 1b, then an incoming packet is considered oversized if it exceeds rlpml.lpml bytes. if receives are not enabled, this register does not increment. these lengths are based on bytes in the received packet from <destination address> through <crc>, inclusive.

note: the maximum size of a packet when lpe is 0b is fixed according to the ctrl_ext.extended_vlan bit and the detection of a vlan tag in the packet.

field     bit(s)  initial value  description
rjc       31:0    0x0            number of receive jabber errors.

8.19.39 management packets received count - mngprc (0x040b4; rc)

this register counts the total number of packets received that pass the management filters as described in the total cost of ownership (tco) system management bus interface application note. any packets with errors are not counted, except packets that are dropped because the management receive fifo is full. packets sent to both the host and the management interface are not counted by this counter.

field     bit(s)  initial value  description
mngprc    31:0    0x0            number of management packets received.

8.19.40 bmc management packets received count - bmngprc (0x0413c; rc)

this register counts the total number of packets received that pass the management filters as described in the total cost of ownership (tco) system management bus interface application note. any packets with errors are not counted, except packets that are dropped because the management receive fifo is full. this register is available only to firmware.

field     bit(s)  initial value  description
mngprc    31:0    0x0            number of management packets received.
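the four receive size-error counters described in sections 8.19.35 through 8.19.38 (ruc, rfc, roc, and rjc) differ only in whether the frame is too short or too long and whether its crc is valid. the python sketch below captures that decision as a small function. rx_size_error_counter and its parameters are hypothetical names for this example; max_length must be supplied by the caller because it depends on lpe, the extended vlan setting, and vlan tag detection. the sketch assumes the frame passed address filtering and receives are enabled, and it does not model the next generation vmdq rule or the note about runts under 25 bytes.

```python
from typing import Optional


def rx_size_error_counter(length: int, crc_ok: bool, max_length: int,
                          min_length: int = 64) -> Optional[str]:
    """Pick the size-error counter expected to count a received frame.

    length     -- frame length in bytes as the datasheet measures it
    crc_ok     -- True when the frame's crc is valid
    max_length -- caller-supplied maximum (for example 1518, 1522, or 1526
                  with lpe clear, or the long-packet limit with lpe set)
    Returns None when the length is within the legal range.
    """
    if length < min_length:
        # Too short: undersize with a good crc, fragment with a bad one.
        return "ruc" if crc_ok else "rfc"
    if length > max_length:
        # Too long: oversize with a good crc, jabber with a bad one.
        return "roc" if crc_ok else "rjc"
    return None
```

for instance, a 1600-byte frame with a bad crc and max_length=1522 maps to rjc, while the same frame with a good crc maps to roc.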
8.19.41 management packets dropped count - mpdc (0x040b8; rc)

this register counts the total number of packets received that pass the management filters as described in the total cost of ownership (tco) system management bus interface application note, that are dropped because the management receive fifo is full. management packets include any packet directed to the manageability console (for example, mc and arp packets).

field     bit(s)  initial value  description
mpdc      31:0    0x0            number of management packets dropped.

8.19.42 bmc management packets dropped count - bmpdc (0x04140; rc)

this register counts the total number of packets received that pass the management filters as described in the total cost of ownership (tco) system management bus interface application note, that are dropped because the management receive fifo is full. management packets include any packet directed to the manageability console (for example, mc and arp packets). this register is available only to firmware.

field     bit(s)  initial value  description
mpdc      31:0    0x0            number of management packets dropped.

8.19.43 management packets transmitted count - mngptc (0x040bc; rc)

this register counts the total number of transmitted packets originating from the manageability path.

field     bit(s)  initial value  description
mptc      31:0    0x0            number of management packets transmitted.

8.19.44 bmc management packets transmitted count - bmngptc (0x04144; rc)

this register counts the total number of transmitted packets originating from the manageability path. this register is available to the firmware only.

field     bit(s)  initial value  description
mptc      31:0    0x0            number of management packets transmitted.

8.19.45 total octets received - torl (0x040c0; rc)

these registers make up a logical 64-bit register which counts the total number of octets received. this register must be accessed using two independent 32-bit accesses; torl must be read before torh. this register sticks at 0xffff_ffff_ffff_ffff when the maximum value is reached.
all packets received have their octets summed into this register, regardless of their length, whether they have errors, or whether they are flow control packets. this register includes bytes received in a packet from the <destination address> field through the <crc> field, inclusive. this register only increments if receives are enabled.

note: broadcast rejected packets are counted in this counter (as opposed to all other rejected packets, which are not counted).

field     bit(s)  initial value  description
torl      31:0    0x0            number of total octets received - lower 4 bytes.

8.19.46 total octets received - torh (0x040c4; rc)

field     bit(s)  initial value  description
torh      31:0    0x0            number of total octets received - upper 4 bytes.

8.19.47 total octets transmitted - totl (0x040c8; rc)

these registers make up a 64-bit register that counts the total number of octets transmitted. this register must be accessed using two independent 32-bit accesses; totl must be read before toth. this register sticks at 0xffff_ffff_ffff_ffff when the maximum value is reached.

all transmitted packets have their octets summed into this register, regardless of their length or whether they are flow control packets. this register includes bytes transmitted in a packet from the <destination address> field through the <crc> field, inclusive. octets transmitted as part of partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled.

field     bit(s)  initial value  description
totl      31:0    0x0            number of total octets transmitted - lower 4 bytes.

8.19.48 total octets transmitted - toth (0x040cc; rc)

field     bit(s)  initial value  description
toth      31:0    0x0            number of total octets transmitted - upper 4 bytes.

8.19.49 total packets received - tpr (0x040d0; rc)

this register counts the total number of all packets received. all packets received are counted in this register, regardless of their length, whether they have errors, or whether they are flow control packets. this register only increments if receives are enabled.

note: broadcast rejected packets are counted in this counter (as opposed to all other rejected packets, which are not counted). runt packets smaller than 25 bytes may not be counted by this counter.
tpr can count packets interrupted by a link disconnect even though they have a crc error.

field     bit(s)  initial value  description
tpr       31:0    0x0            number of all packets received.

8.19.50 total packets transmitted - tpt (0x040d4; rc)

this register counts the total number of all packets transmitted. all packets transmitted are counted in this register, regardless of their length or whether they are flow control packets. partial packet transmissions (collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled. this register counts all packets, including standard packets, secure packets, packets received over the smbus, and packets generated by the pt function.

field     bit(s)  initial value  description
tpt       31:0    0x0            number of all packets transmitted.

8.19.51 packets transmitted [64 bytes] count - ptc64 (0x040d8; rc)

this register counts the number of packets transmitted that are exactly 64 bytes (from <destination address> through <crc>, inclusive) in length. partial packet transmissions (collisions in half-duplex mode) are not included in this register. this register does not include transmitted flow control packets (which are 64 bytes in length). this register only increments if transmits are enabled. this register counts all packets, including standard packets, secure packets, packets received over the smbus, and packets generated by the pt function.

field     bit(s)  initial value  description
ptc64     31:0    0x0            number of packets transmitted that are 64 bytes in length.

8.19.52 packets transmitted [65-127 bytes] count - ptc127 (0x040dc; rc)

this register counts the number of packets transmitted that are 65-127 bytes (from <destination address> through <crc>, inclusive) in length. partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled. this register counts all packets, including standard packets, secure packets, packets received over the smbus, and packets generated by the pt function.

field     bit(s)  initial value  description
ptc127    31:0    0x0            number of packets transmitted that are 65-127 bytes in length.
8.19.53 packets transmitted [128-255 bytes] count - ptc255 (0x040e0; rc)

this register counts the number of packets transmitted that are 128-255 bytes (from <destination address> through <crc>, inclusive) in length. partial packet transmissions (collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled. this register counts all packets, including standard packets, secure packets, packets received over the smbus, and packets generated by the pt function.

field     bit(s)  initial value  description
ptc255    31:0    0x0            number of packets transmitted that are 128-255 bytes in length.

8.19.54 packets transmitted [256-511 bytes] count - ptc511 (0x040e4; rc)

this register counts the number of packets transmitted that are 256-511 bytes (from <destination address> through <crc>, inclusive) in length. partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled. this register counts all packets, including standard and secure packets. management packets must never be more than 200 bytes.

field     bit(s)  initial value  description
ptc511    31:0    0x0            number of packets transmitted that are 256-511 bytes in length.

8.19.55 packets transmitted [512-1023 bytes] count - ptc1023 (0x040e8; rc)

this register counts the number of packets transmitted that are 512-1023 bytes (from <destination address> through <crc>, inclusive) in length. partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled. this register counts all packets, including standard and secure packets. management packets must never be more than 200 bytes.

field     bit(s)  initial value  description
ptc1023   31:0    0x0            number of packets transmitted that are 512-1023 bytes in length.

8.19.56 packets transmitted [1024 bytes or greater] count - ptc1522 (0x040ec; rc)

this register counts the number of packets transmitted that are 1024 or more bytes (from <destination address> through <crc>, inclusive) in length. partial packet transmissions (for example, collisions in half-duplex mode) are not included in this register. this register only increments if transmits are enabled.
due to changes in the standard for maximum frame size for vlan tagged frames in 802.3, the 82576 transmits packets that have a maximum length of 1522 bytes. the rmon statistics associated with this range have been extended to count 1522 byte long packets. this register counts all packets, including standard and secure packets (management packets must never be more than 200 bytes). if ctrl_ext.extended_vlan is set, packets up to 1526 bytes are counted by this counter.

field     bit(s)  initial value  description
ptc1522   31:0    0x0            number of packets transmitted that are 1024 or more bytes in length.

8.19.57 multicast packets transmitted count - mptc (0x040f0; rc)

this register counts the number of multicast packets transmitted. this register does not include flow control packets and increments only if transmits are enabled. it counts clear as well as secure traffic.

field     bit(s)  initial value  description
mptc      31:0    0x0            number of multicast packets transmitted.

8.19.58 broadcast packets transmitted count - bptc (0x040f4; rc)

this register counts the number of broadcast packets transmitted. this register only increments if transmits are enabled. this register counts all packets, including standard and secure packets (management packets must never be more than 200 bytes).

field     bit(s)  initial value  description
bptc      31:0    0x0            number of broadcast packets transmitted.

8.19.59 tcp segmentation context transmitted count - tsctc (0x040f8; rc)

this register counts the number of tcp segmentation offload transmissions and increments once the last portion of the tcp segmentation context payload is segmented and loaded as a packet into the on-chip transmit buffer. note that it is not a measurement of the number of packets sent out (covered by other registers). this register only increments if transmits and tcp segmentation offload are enabled. this counter only counts pure tso transmissions.

field     bit(s)  initial value  description
tsctc     31:0    0x0            number of tcp segmentation contexts transmitted.

8.19.60 circuit breaker rx manageability packet count - cbrmpc (0x040fc; rc)

field     bit(s)  initial value  description
cbrmpc    31:0    0              total number of rx packets sent by circuit breaker to the manageability path.
this register counts the total number of rx packets sent by circuit breaker to the manageability path. this register only increments if the receive path and circuit breaker are enabled. this counter counts only packets that passed layer 2 filtering and were sent to the host by the rx filter, but were redirected to the manageability path according to the circuit breaker decision.

8.19.61 interrupt assertion count - iac (0x04100; rc)

this counter counts the total number of lan interrupts generated in the system. in msi-x systems, this counter reflects the total number of msi-x messages that are emitted.

field     bit(s)  initial value  description
iac       31:0    0x0            this is a count of all the lan interrupt assertions that have occurred.

8.19.62 rx packets to host count - rpthc (0x04104; rc)

field     bit(s)  initial value  description
rpthc     31:0    0x0            this is a count of all the received packets sent to the host.

8.19.63 debug counter 1 - dbgc1 (0x04108; rc)

field     bit(s)  initial value  description
dbgc1     31:0    0x0            this field counts the events according to the value of the pbdiag.stat_sel field. the possible values for this counter are listed in table 8-22.

table 8-22. dbgc1 values

stat sel  counter 1 content
0         number of switch packets read from dbu0
1         the number of tx descriptor wb transactions performed for q0
2         the number of rx descriptor wb transactions performed for q0
3         the number of rx descriptor immediate wb transactions performed for q0
4         the number of tx host descriptors read by the descriptor handler
5         the number of rx host descriptors read by the descriptor processor
6         the number of tx data read requests done by the dhost
7         the number of tx packets sent to dbu
8.19.64 debug counter 2 - dbgc2 (0x0410c; rc)

field     bit(s)  initial value  description
dbgc2     31:0    0x0            this field counts the events according to the value of the pbdiag.stat_sel field. the possible values for this counter are listed in table 8-23.

table 8-23. dbgc2 values

stat sel  counter 2 content
0         number of rx filter packets read from dbu0
1         the number of tx descriptor wb transactions performed for q1
2         the number of rx descriptor wb transactions performed for q1
3         the number of rx descriptor immediate wb transactions performed for q1
4         the number of tx host descriptors written back to host
5         the number of rx host descriptors written back to host
6         the number of rx data write requests done by the dhost
7         reserved

8.19.65 debug counter 3 - dbgc3 (0x04110; rc)

field     bit(s)  initial value  description
dbgc3     31:0    0x0            this field counts the events according to the value of the pbdiag.stat_sel field. the possible values for this counter are listed in table 8-24.

table 8-24. dbgc3 values

stat sel  counter 3 content
0         number of switch packets read from dbu1
1         the number of tx descriptor wb transactions performed for q2
2         the number of rx descriptor wb transactions performed for q2
3         the number of rx descriptor immediate wb transactions performed for q2
4         the number of dropped tx packets
5         the number of rx packets written to the dbu
6         the number of pci write requests done by the dma
7         the total amount of single send packets
8.19.66 debug counter 4 - dbgc4 (0x0411c; rc)

field     bit(s)  initial value  description
dbgc4     31:0    0x0            this field counts the events according to the value of the pbdiag.stat_sel field. the possible values for this counter are listed in table 8-25.

table 8-25. dbgc4 values

stat sel  counter 4 content
0         number of rx filter packets read from dbu1
1         the number of tx descriptor wb transactions performed for q3
2         the number of rx descriptor wb transactions performed for q3
3         the number of rx descriptor immediate wb transactions performed for q3
4         the number of tx packets read from the dbu
5         the number of rx packets read from the dbu
6         number of pci read/write requests done by the dhost
7         the total amount of large send packets

8.19.67 host good packets transmitted count - hgptc (0x04118; rc)

this register counts the number of good (non-erred) packets transmitted by the host. a good transmit packet is one that is 64 or more bytes (from <destination address> through <crc>, inclusive) in length. this does not include transmitted flow control packets or packets sent by the manageability engine. this register only increments if transmits are enabled.

field     bit(s)  initial value  description
hgptc     31:0    0x0            number of good packets transmitted by the host.

8.19.68 receive descriptor minimum threshold count - rxdmtc (0x04120; rc)

this register counts the number of events where the number of descriptors in one of the rx queues was lower than the threshold defined for this queue.

field     bit(s)  initial value  description
rxdmtc    31:0    0x0            this is a count of the receive descriptor minimum threshold events.
8.19.69 host tx circuit breaker dropped packets count - htcbdpc (0x04124; rc)

this register counts the number of packets sent by the host (and not the manageability engine) that are dropped by the circuit breaker filters.

field     bit(s)  initial value  description
htcbdpc   31:0    0              number of packets sent by the host but discarded by the circuit breaker.

8.19.70 host good octets received count - hgorcl (0x04128; rc)

field     bit(s)  initial value  description
hgorcl    31:0    0x0            number of good octets received by host - lower 4 bytes.

8.19.71 host good octets received count - hgorch (0x0412c; rc)

these registers make up a logical 64-bit register which counts the number of good (non-erred) octets received. this register includes bytes received in a packet from the <destination address> field through the <crc> field, inclusive. this register must be accessed using two independent 32-bit accesses; hgorcl must be read before hgorch. in addition, it sticks at 0xffff_ffff_ffff_ffff when the maximum value is reached. only packets that pass address filtering are counted in this register. this register counts only octets of packets that reached the host. the only exception is packets dropped by the dma because of a lack of descriptors in one of the queues; these packets are included in this counter. this register only increments if receives are enabled.

field     bit(s)  initial value  description
hgorch    31:0    0x0            number of good octets received by host - upper 4 bytes.

8.19.72 host good octets transmitted count - hgotcl (0x04130; rc)

field     bit(s)  initial value  description
hgotcl    31:0    0x0            number of good octets transmitted by host - lower 4 bytes.
8.19.73 host good octets transmitted count - hgotch (0x04134; rc)

these registers make up a logical 64-bit register which counts the number of good (non-erred) octets transmitted. this register must be accessed using two independent 32-bit accesses. this register resets whenever the upper 32 bits are read (hgotch). in addition, it sticks at 0xffff_ffff_ffff_ffff when the maximum value is reached. this register includes bytes transmitted in a packet from the <destination address> field through the <crc> field, inclusive. this register counts octets in successfully transmitted packets which are 64 or more bytes in length. this register only increments if transmits are enabled. the register counts clear as well as secure octets. these octets do not include octets in transmitted flow control packets or manageability packets. packets blocked by the circuit breaker mechanism are not counted by this counter.

field     bit(s)  initial value  description
hgotch    31:0    0x0            number of good octets transmitted by host - upper 4 bytes.

8.19.74 length error count - lenerrs (0x04138; rc)

this register counts the number of receive packets with length errors. for example, valid packets (no crc error) with a length/type field value that is smaller than or equal to 1500 but greater than the frame size. in order for a packet to be counted in this register, it must pass address filtering and must be 64 bytes or greater (from <destination address> through <crc>, inclusive) in length. if receives are not enabled, then this register does not increment.

field     bit(s)  initial value  description
lenerrs   31:0    0x0            length error count.

8.19.75 serdes/sgmii code violation packet count - scvpc (0x04228; rw)

this register contains the number of code violation packets received. a code violation is defined as an invalid received code in the middle of a packet.

field     bit(s)  initial value  description
codevio   31:0    0x0            code violation packet count: at any point in time, this field specifies the number of unknown protocol packets received. valid only in sgmii/serdes mode.

8.19.76 switch security violation packet count - ssvpc (0x41a0; rc)

this register counts tx packets dropped due to switch security violations such as sa anti-spoof filtering. valid only in next generation vmdq or iov mode.
field | bit(s) | initial value | description
ssvpc | 31:0 | 0x0 | switch security violation packet count: this register counts tx packets dropped due to switch security violations such as sa anti-spoof filtering.

8.19.77 switch drop packet count - sdpc (0x41a4; rc)
field | bit(s) | initial value | description
sdpc | 31:0 | 0x0 | switch drop packet count: this register counts rx packets dropped at the pool selection stage of the switch or by the storm control mechanism, for example packets that were not routed to any of the pools while vt_ctl.dis_def_pool is set. valid only in next generation vmdq or iov mode.

8.20 wake up control register descriptions

8.20.1 wakeup control register - wuc (0x05800; r/w)
the pme_en and pme_status bits of this register are reset on an internal_power_on_reset event. when aux_pwr = 0b, this register is also reset by de-asserting pe_rst_n and during a d3 to d0 transition. the other bits are reset using the standard internal resets.
field | bit(s) | initial value | description
apme | 0 | 0b [1] | advance power management enable. if set to 1b, apm wakeup is enabled. if this bit is set and the apmpme bit is cleared, reception of a magic packet asserts the wus.mag bit but does not assert a pme.
pme_en | 1 | 0b | pme_en. this read/write bit is used by the software device driver to access the pme_en bit of the power management control / status register (pmcsr) without writing to the pcie configuration space.
pme_status | 2 | 0b | pme_status. this bit is set when the 82576 receives a wakeup event. it is the same as the pme_status bit in the power management control / status register (pmcsr). writing a 1b to this bit clears the pme_status bit in the pmcsr.
apmpme | 3 | 0b [1] | assert pme on apm wakeup. if set to 1b, the 82576 sets the pme_status bit in the power management control / status register (pmcsr) and asserts pme# when apm wakeup is enabled and the 82576 receives a matching magic packet.
reserved | 31:4 | 0x0 | reserved.
[1] loaded from the eeprom.

8.20.2 wakeup filter control register - wufc (0x05808; r/w)
this register is used to enable each of the pre-defined and flexible filters for wakeup support. a value of 1b means the filter is turned on; a value of 0b means the filter is turned off.
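as an illustration of the 1b = on, 0b = off convention, the sketch below composes a wufc value from filter names. the bit positions are taken from the wufc field table in this section; the helper name build_wufc and the dictionary are hypothetical and are not part of any intel software interface.

```python
# bit positions from the wufc field table (section 8.20.2)
WUFC_BITS = {
    "lnkc": 0, "mag": 1, "ex": 2, "mc": 3, "bc": 4,
    "arp": 5, "ipv4": 6, "ipv6": 7,
    "flx0": 16, "flx1": 17, "flx2": 18,
    "flx3": 19, "flx4": 20, "flx5": 21,
}
WUFC_NOTCO = 1 << 15  # ignore tco/management packets for wakeup


def build_wufc(filters, ignore_tco=False):
    """Return a wufc value with the named wakeup filters turned on."""
    value = 0
    for name in filters:
        value |= 1 << WUFC_BITS[name]  # 1b turns the filter on
    if ignore_tco:
        value |= WUFC_NOTCO
    return value


def enabled_filters(value):
    """Return the names of the filters turned on in a wufc value."""
    return [name for name, bit in WUFC_BITS.items() if (value >> bit) & 1]


magic_and_flex0 = build_wufc(["mag", "flx0"])
```

bits that are not named in the dictionary stay at 0b, which matches the requirement that the reserved bits be written as 0b.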
if the notco bit is set, then any packet that passes the manageability packet filtering described in the total cost of ownership (tco) system management bus interface application note does not cause a wake up event, even if it passes one of the wake up filters.
field | bit(s) | initial value | description
lnkc | 0 | 0b | link status change wakeup enable.
mag | 1 | 0b | magic packet wakeup enable.
ex | 2 | 0b | directed exact wakeup enable. [1]
mc | 3 | 0b | directed multicast wakeup enable.
bc | 4 | 0b | broadcast wakeup enable.
arp | 5 | 0b | arp request packet wakeup enable.
ipv4 | 6 | 0b | directed ipv4 packet wakeup enable.
ipv6 | 7 | 0b | directed ipv6 packet wakeup enable.
reserved | 14:8 | 0b | reserved. set these bits to 0b.
notco | 15 | 0b | ignore tco/management packets for wakeup.
flx0 | 16 | 0b | flexible filter 0 enable.
flx1 | 17 | 0b | flexible filter 1 enable.
flx2 | 18 | 0b | flexible filter 2 enable.
flx3 | 19 | 0b | flexible filter 3 enable.
flx4 | 20 | 0b | flexible filter 4 enable.
flx5 | 21 | 0b | flexible filter 5 enable.
reserved | 31:22 | 0x0 | reserved.
[1] if rctl.upe is set and the ex bit is also set, any unicast packet wakes up the system.

8.20.3 wakeup status register - wus (0x05810; r/w1c)
this register records statistics about all wakeup packets received. if a packet matches multiple criteria, multiple bits can be set. writing a 1b to any bit clears that bit. this register is not cleared when rst# is asserted; it is cleared only when internal_power_on_reset is de-asserted or when cleared by the software device driver.
field | bit(s) | initial value | description
lnkc | 0 | 0b | link status change.
mag | 1 | 0b | magic packet received.
ex | 2 | 0b | directed exact packet received. the packet's address matched one of the 16 pre-programmed exact values in the receive address registers.
mc | 3 | 0b | directed multicast packet received. the packet was a multicast packet hashed to a value that corresponded to a 1 bit in the multicast table array.
bc | 4 | 0b | broadcast packet received.
arp | 5 | 0b | arp request packet received.
ipv4 | 6 | 0b | directed ipv4 packet received.
ipv6 | 7 | 0b | directed ipv6 packet received.
mng | 8 | 0b | indicates that a manageability event that should cause a pme happened.
reserved | 15:9 | 0b | reserved.
flx0 | 16 | 0b | flexible filter 0 match.
flx1 | 17 | 0b | flexible filter 1 match.
flx2 | 18 | 0b | flexible filter 2 match.
flx3 | 19 | 0b | flexible filter 3 match.
flx4 | 20 | 0b | flexible filter 4 match.
flx5 | 21 | 0b | flexible filter 5 match.
reserved | 31:22 | 0b | reserved.

8.20.4 wakeup packet length - wupl (0x05900; ro)
this register indicates the length of the first wakeup packet received. it is valid if one of the bits in the wakeup status register (wus) is set. it is not cleared by any reset.
field | bit(s) | initial value | description
len | 11:0 | x | length of wakeup packet. (if jumbo frames are enabled and the packet is longer than 2047 bytes, this field is 2047.)
reserved | 31:12 | 0x0 | reserved.

8.20.5 wakeup packet memory - wupm (0x05a00 + 4*n [n=0...31]; ro)
this register is read-only and stores the first 128 bytes of the wakeup packet for software retrieval after system wakeup. it is not cleared by any reset.
field | bit(s) | initial value | description
wupd | 31:0 | x | wakeup packet data.

8.20.6 ip address valid - ipav (0x5838; r/w)
the ip address valid register indicates whether the ip addresses in the ip address table are valid.
field | bit(s) | initial value | description
v40 | 0 | 0b | ipv4 address 0 valid.
v41 | 1 | 0b | ipv4 address 1 valid.
v42 | 2 | 0b | ipv4 address 2 valid.
v43 | 3 | 0b | ipv4 address 3 valid.
reserved | 15:4 | 0x0 | reserved.
v60 | 16 | 0b | ipv6 address 0 valid.
reserved | 31:17 | 0b | reserved.

8.20.7 ipv4 address table - ip4at (0x05840 + 8*n [n=0...3]; r/w)
the ipv4 address table is used to store the four ipv4 addresses for the arp/ipv4 request packet and directed ip packet wakeup.
note: this table is not cleared by any reset.
field | bit(s) | initial value | description
ip address | 31:0 | x | ipv4 address n.
field | dword # | address | bit(s) | initial value | description
ipv4addr0 | 0 | 0x5840 | 31:0 | x | ipv4 address 0.
ipv4addr1 | 2 | 0x5848 | 31:0 | x | ipv4 address 1.
ipv4addr2 | 4 | 0x5850 | 31:0 | x | ipv4 address 2.
ipv4addr3 | 6 | 0x5858 | 31:0 | x | ipv4 address 3.

8.20.8 ipv6 address table - ip6at (0x05880 + 4*n [n=0...3]; r/w)
the ipv6 address table is used to store the ipv6 addresses for neighbor discovery packet filtering and directed ip packet wakeup.
note: this table is not cleared by any reset.
field | bit(s) | initial value | description
ip address | 31:0 | x | ipv6 address bytes 4*n+1:4*n+4.
field | dword # | address | bit(s) | initial value | description
ipv6addr0 | 0 | 0x5880 | 31:0 | x | ipv6 address 0, bytes 1-4.
ipv6addr0 | 1 | 0x5884 | 31:0 | x | ipv6 address 0, bytes 5-8.
ipv6addr0 | 2 | 0x5888 | 31:0 | x | ipv6 address 0, bytes 9-12.
ipv6addr0 | 3 | 0x588c | 31:0 | x | ipv6 address 0, bytes 13-16.

8.20.9 flexible host filter table registers - fhft (0x09000 - 0x093fc; rw)
each of the 4 flexible host filter table registers (fhft) contains a 128-byte pattern and a corresponding 128-bit mask array. if enabled, the first 128 bytes of the received packet are compared against the non-masked bytes in the fhft register.
each 128-byte filter is composed of 32 dw entries, where each 2 dws are accompanied by an 8-bit mask, one bit per filter byte.
31:0 | 31:8 | 7:0 | 31:0 | 31:0
reserved | reserved | mask [7:0] | dw 1 | dw 0
reserved | reserved | mask [15:8] | dw 3 | dw 2
reserved | reserved | mask [23:16] | dw 5 | dw 4
reserved | reserved | mask [31:24] | dw 7 | dw 6
...
reserved | reserved | mask [119:112] | dw 29 | dw 28
length | reserved | mask [127:120] | dw 31 | dw 30
the last dw of each filter contains a length field defining the number of bytes from the beginning of the packet compared by this filter. if the actual packet length is less than (length - 8), where length is the value specified by the length field, the filter fails. otherwise, the outcome depends on the result of the actual byte comparison. the value should not be greater than 128.
note: the length field must be 8-byte aligned. to filter packets whose length is not 8-byte aligned, round the value up to the next 8-byte aligned value. the hardware implementation compares 8 bytes at a time, so the added bytes should get zero mask bits (if needed) up to the end of the length value.
field | dword | address | bit(s) | initial value
filter 0 dw0 | 0 | 0x09000 | 31:0 | x
filter 0 dw1 | 1 | 0x09004 | 31:0 | x
filter 0 mask[7:0] | 2 | 0x09008 | 7:0 | x
reserved | 3 | 0x0900c | | x
filter 0 dw2 | 4 | 0x09010 | 31:0 | x
...
filter 0 dw30 | 60 | 0x090f0 | 31:0 | x
filter 0 dw31 | 61 | 0x090f4 | 31:0 | x
filter 0 mask[127:120] | 62 | 0x090f8 | 7:0 | x
length | 63 | 0x090fc | 6:0 | x
accessing the fhft registers during filter operation can result in a packet being mis-classified if the write operation collides with packet reception. it is therefore advised that the flex filters be disabled prior to changing their setup.

8.20.10 flexible host filter table extended registers - fhft_ext (0x09a00 - 0x09bfc; rw)
each of the 2 additional flexible host filter table extended registers (fhft_ext) contains a 128-byte pattern and a corresponding 128-bit mask array. if enabled, the first 128 bytes of the received packet are compared against the non-masked bytes in the fhft_ext register. the layout and access rules of this table are the same as in fhft.

8.21 management register descriptions
all management registers are controlled by the remote mc for both read and write. host accesses to the management registers are blocked (read and write) unless debug write is enabled. the attributes for the fields in this chapter refer to the mc access rights.

8.21.1 management vlan tag value - mavtv (0x5010 + 4*n [n=0...7]; rw)
where "n" is the vlan filter serial number, equal to 0, 1, ..., 7.
the mavtv registers are written by the mc and are not accessible to the host for writing. the registers are used to filter manageability packets as described in the management chapter.
field | bit(s) | initial value | description
vid | 11:0 | 0x0 | contains the vlan id that should be compared with the incoming packet if the corresponding bit in mfval.vlan is set.
rsv | 31:12 | 0x0 | reserved.
8.21.2 management flex udp/tcp ports - mfutp (0x5030 + 4*n [n=0...7]; rw)
where each 32-bit register (n = 0, ..., 7) refers to two port filters (register 0 refers to ports 0 and 1, register 1 refers to ports 2 and 3, etc.).
the mfutp registers are written by the mc and are not accessible to the host for writing. the registers are used to filter manageability packets. see section 10.4.
reset - the mfutp registers are cleared on internal_power_on_reset only. the initial values for this register can be loaded from the eeprom after power-up reset.
note: the mfutp_even and mfutp_odd fields should be written in network order.
field | bit(s) | initial value | description
mfutp_even | 15:0 | 0x0 | management flex udp/tcp port i (the even-numbered port of the pair).
mfutp_odd | 31:16 | 0x0 | management flex udp/tcp port i+1 (the odd-numbered port of the pair).

8.21.3 management ethernet type filters - metf (0x5060 + 4*n [n=0...3]; rw)
the metf registers are written by the mc and are not accessible to the host for writing. the registers are used to filter manageability packets. see section 10.4.
reset - the metf registers are cleared on internal_power_on_reset only. the initial values for this register might be loaded from the eeprom after power-up reset.
field | bit(s) | initial value | description
metf | 15:0 | 0x0 | ethertype value to be compared against the l2 ethertype field in the rx packet.
reserved | 29:16 | 0x0 | reserved.
polarity | 30 | 0b | 0b = positive filter - forward packets matching this filter to the manageability block. 1b = negative filter - block packets matching this filter from the manageability block.
reserved | 31 | 0b | reserved.

8.21.4 management control register - manc (0x05820; rw)
field | bit(s) | initial value | description
reserved | 15:0 | 0x0 | reserved.
tco_reset | 16 | 0b | tco reset occurred. set to 1b on a tco reset. this bit is only reset by internal_power_on_reset.
rcv_tco_en | 17 | 0b | receive tco packets enabled. when this bit is set, it enables the receive flow from the wire to the mng block.
keep_phy_link_up | 18 | 0b | block phy reset and power state changes. when this bit is set, phy reset and power state changes do not reach the phy. this bit cannot be written unless the keep_phy_link_up_en eeprom bit is set. this bit is reset by internal_power_on_reset.
rcv_all | 19 | 0b | receive all enable. when set, all received packets that passed l2 filtering are directed to the mng block.
rcv_all_mcst | 20 | 0b | receive all multicast. when set, all received multicast packets pass l2 filtering and might be directed to the mng block by one of the decision filters. broadcast packets are not forwarded by this bit.
en_mng2host | 21 | 0b | enable mng packets to host memory. this bit enables the functionality of the manc2h register. when set, the packets that are specified in the manc2h register are also forwarded to host memory, if they pass the manageability filters.
bypass vlan | 22 | 0b | when set, vlan filtering is bypassed for mng packets.
en_xsum_filter | 23 | 0b | enable checksum filtering to mng. when this bit is set, only packets that pass the l3 and l4 checksums are sent to the mng block.
en_ipv4_filter | 24 | 0b | enable ipv4 address filters. when set, the last 128 bits of the mipaf register are used to store 4 ipv4 addresses for ipv4 filtering. when cleared, these bits store a single ipv6 filter.
fixed_net_type | 25 | 0b | fixed net type. if set, only packets matching the net type defined by the net_type field pass to manageability. otherwise, both tagged and un-tagged packets can be forwarded to the manageability engine.
net_type | 26 | 0b | net type. 0b = pass only un-tagged packets. 1b = pass only vlan tagged packets. valid only if fixed_net_type is set.
macsec mode | 27 | 0b | when set, only packets that match one of the following 3 conditions are forwarded to manageability: (1) the packet is a macsec packet authenticated and/or decrypted adequately by the hw; (2) the packet ethertype matches metf[2]; (3) the packet ethertype matches metf[3].
reserved | 31:28 | 0b | reserved.

8.21.5 manageability filters valid - mfval (0x5824; rw)
the manageability filters valid register indicates which filter registers contain a valid entry.
reset - the mfval register is cleared on internal_power_on_reset and firmware reset.
field | bit(s) | initial value [1] | description
mac | 3:0 | 0x0 | mac. indicates if the mac unicast filter registers (mmah, mmal) contain valid mac addresses. bit 0 corresponds to filter 0, etc.
reserved | 7:4 | 0x0 | reserved.
vlan | 15:8 | 0x0 | vlan. indicates if the vlan filter registers (mavtv) contain valid vlan tags. bit 8 corresponds to filter 0, etc.
ipv4 | 19:16 | 0x0 | ipv4. indicates if the ipv4 address filters (mipaf) contain valid ipv4 addresses. bit 16 corresponds to ipv4 address 0. these bits apply only when ipv4 address filters are enabled (manc.en_ipv4_filter = 1b).
reserved | 23:20 | 0x0 | reserved.
ipv6 | 27:24 | 0x0 | ipv6. indicates if the ipv6 address filter registers (mipaf) contain valid ipv6 addresses. bit 24 corresponds to address 0, etc. bit 27 (filter 3) applies only when ipv4 address filters are not enabled (manc.en_ipv4_filter = 0b).
reserved | 31:28 | 0x0 | reserved.
[1] the initial values for this register can be loaded from the eeprom after power-up reset or firmware reset. the mfval register is written by the mc and is not accessible to the host for writing.

8.21.6 management control to host register - manc2h (0x5860; rw)
the manc2h register allows forwarding of manageability packets to the host based on the decision filter that forwarded them to the manageability micro-controller. each manageability decision filter (mdef) has a corresponding bit in the manc2h register. when a manageability decision filter forwards a packet to manageability, it also forwards the packet to the host if the corresponding manc2h bit is set and if the en_mng2host bit is set. the en_mng2host bit serves as a global enable for the manc2h bits.
field | bit(s) | initial value [1] | description
host enable | 7:0 | 0x0 | host enable. when set, indicates that packets forwarded by the manageability filters to manageability are also sent to the host. bit 0 corresponds to decision rule 0, etc.
reserved | 31:8 | 0x0 | reserved.
[1] reset - the manc2h register is cleared on internal_power_on_reset and firmware reset. the initial values for this register can be loaded from the eeprom after power-up reset or firmware reset.
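the two-level enable described for manc2h can be sketched as below. the function name also_sent_to_host is hypothetical; the bit positions come from the manc table (en_mng2host, bit 21) and the manc2h table (host enable, bits 7:0). the sketch models only the condition stated in this section and says nothing about how the decision filters themselves match.

```python
MANC_EN_MNG2HOST = 1 << 21  # manc.en_mng2host: global enable for manc2h


def also_sent_to_host(manc, manc2h, mdef_index):
    """Return True if a packet forwarded to manageability by decision
    filter mdef_index is also forwarded to the host."""
    if not 0 <= mdef_index <= 7:
        raise ValueError("mdef index must be in the range 0..7")
    global_enable = bool(manc & MANC_EN_MNG2HOST)
    host_enable = bool((manc2h >> mdef_index) & 1)  # bit n <-> decision rule n
    return global_enable and host_enable


# decision filter 2 has its host enable bit set, and the global enable is on
forwarded = also_sent_to_host(manc=1 << 21, manc2h=0b0000_0100, mdef_index=2)
```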
8.21.7 manageability decision filters - mdef (0x5890 + 4*n [n=0...7]; rw)
where "n" is the decision filter.
field | bit(s) | initial value [1] | description
unicast and | 0 | 0b | unicast. controls the inclusion of unicast address filtering in the manageability filter decision (and section).
broadcast and | 1 | 0b | broadcast. controls the inclusion of broadcast address filtering in the manageability filter decision (and section).
vlan and | 2 | 0b | vlan. controls the inclusion of vlan address filtering in the manageability filter decision (and section).
ip address | 3 | 0b | ip address. controls the inclusion of ip address filtering in the manageability filter decision (and section).
unicast or | 4 | 0b | unicast. controls the inclusion of unicast address filtering in the manageability filter decision (or section).
broadcast or | 5 | 0b | broadcast. controls the inclusion of broadcast address filtering in the manageability filter decision (or section).
multicast and | 6 | 0b | multicast. controls the inclusion of multicast address filtering in the manageability filter decision (and section). broadcast packets are not included by this bit. the packet must pass some l2 filtering to be included by this bit, either by the manc.mcst_pass_l2 or by some dedicated mac address.
arp request | 7 | 0b | arp request. controls the inclusion of arp request filtering in the manageability filter decision (or section).
arp response | 8 | 0b | arp response. controls the inclusion of arp response filtering in the manageability filter decision (or section).
neighbor discovery | 9 | 0b | neighbor discovery. controls the inclusion of neighbor discovery filtering in the manageability filter decision (or section). the neighbor types accepted by this filter are types 0x86, 0x87, 0x88 and 0x89.
port 0x298 | 10 | 0b | port 0x298. controls the inclusion of port 0x298 filtering in the manageability filter decision (or section).
port 0x26f | 11 | 0b | port 0x26f. controls the inclusion of port 0x26f filtering in the manageability filter decision (or section).
flex port | 27:12 | 0x0 | flex port. controls the inclusion of flex port filtering in the manageability filter decision (or section). bit 12 corresponds to flex port 0, etc.
flex tco | 31:28 | 0x0 | flex tco. controls the inclusion of flex tco filtering in the manageability filter decision (or section). bit 28 corresponds to flex tco filter 0, etc.
[1] default values are read from eeprom.
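a register value for one decision filter can be assembled from the bit positions in the mdef table, as sketched below. the flag names (for example "vlan_and") and the helper build_mdef are hypothetical labels chosen for this illustration; only the bit numbers come from the table. how the and and or sections combine into a decision is covered elsewhere in the datasheet and is not modeled here.

```python
# single-bit controls from the mdef field table; the names are illustrative
MDEF_BITS = {
    "unicast_and": 0, "broadcast_and": 1, "vlan_and": 2, "ip_address_and": 3,
    "unicast_or": 4, "broadcast_or": 5, "multicast_and": 6,
    "arp_request_or": 7, "arp_response_or": 8, "neighbor_discovery_or": 9,
    "port_0x298_or": 10, "port_0x26f_or": 11,
}
FLEX_PORT_SHIFT = 12  # bits 27:12, bit 12 corresponds to flex port 0
FLEX_TCO_SHIFT = 28   # bits 31:28, bit 28 corresponds to flex tco filter 0


def build_mdef(flags=(), flex_ports=(), flex_tco=()):
    """Return the 32-bit mdef value with the requested controls set."""
    value = 0
    for name in flags:
        value |= 1 << MDEF_BITS[name]
    for port in flex_ports:
        if not 0 <= port <= 15:
            raise ValueError("flex port index must be in the range 0..15")
        value |= 1 << (FLEX_PORT_SHIFT + port)
    for index in flex_tco:
        if not 0 <= index <= 3:
            raise ValueError("flex tco index must be in the range 0..3")
        value |= 1 << (FLEX_TCO_SHIFT + index)
    return value


example = build_mdef(["vlan_and", "arp_request_or"], flex_ports=[0], flex_tco=[3])
```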
8.21.8 manageability decision filters - mdef_ext (0x5930 + 4*n [n=0...7]; rw)
field | bit(s) | initial value [1] | description
l2 ethertype and | 3:0 | 0x0 | l2 ethertype. controls the inclusion of l2 ethertype filtering in the manageability filter decision (and section).
reserved | 7:4 | 0x0 | reserved for additional l2 ethertype and filters.
l2 ethertype or | 11:8 | 0x0 | l2 ethertype. controls the inclusion of l2 ethertype filtering in the manageability filter decision (or section).
reserved | 15:12 | 0x0 | reserved for additional l2 ethertype or filters.
reserved | 31:16 | 0x0 | reserved.
[1] default values are read from eeprom.

8.21.9 manageability ip address filter - mipaf (0x58b0 + 4*n [n=0...15]; rw)
the manageability ip address filter register stores ip addresses for manageability filtering. the mipaf register can be used in two configurations, depending on the value of the manc.en_ipv4_filter bit:
- en_ipv4_filter = 0: the last 128 bits of the register store a single ipv6 address (ipv6addr3)
- en_ipv4_filter = 1: the last 128 bits of the register store 4 ipv4 addresses (ipv4addr[3:0])
en_ipv4_filter = 0:
dword # | address | bits 31:0
0 | 0x58b0 | ipv6addr0
1 | 0x58b4 | ipv6addr0
2 | 0x58b8 | ipv6addr0
3 | 0x58bc | ipv6addr0
4 | 0x58c0 | ipv6addr1
5 | 0x58c4 | ipv6addr1
6 | 0x58c8 | ipv6addr1
7 | 0x58cc | ipv6addr1
8 | 0x58d0 | ipv6addr2
9 | 0x58d4 | ipv6addr2
10 | 0x58d8 | ipv6addr2
11 | 0x58dc | ipv6addr2
12 | 0x58e0 | ipv6addr3
13 | 0x58e4 | ipv6addr3
14 | 0x58e8 | ipv6addr3
15 | 0x58ec | ipv6addr3
field definitions for 0 setting:
field | dword # | address | bit(s) | initial value | description
ipv6addr0 | 0 | 0x58b0 | 31:0 | x* | ipv6 address 0, bytes 1-4 (l.s. byte is first on the wire).
ipv6addr0 | 1 | 0x58b4 | 31:0 | x* | ipv6 address 0, bytes 5-8.
ipv6addr0 | 2 | 0x58b8 | 31:0 | x* | ipv6 address 0, bytes 9-12.
ipv6addr0 | 3 | 0x58bc | 31:0 | x* | ipv6 address 0, bytes 13-16.
ipv6addr1 | 0 | 0x58c0 | 31:0 | x* | ipv6 address 1, bytes 1-4 (l.s. byte is first on the wire).
ipv6addr1 | 1 | 0x58c4 | 31:0 | x* | ipv6 address 1, bytes 5-8.
ipv6addr1 | 2 | 0x58c8 | 31:0 | x* | ipv6 address 1, bytes 9-12.
ipv6addr1 | 3 | 0x58cc | 31:0 | x* | ipv6 address 1, bytes 13-16.
ipv6addr2 | 0 | 0x58d0 | 31:0 | x* | ipv6 address 2, bytes 1-4 (l.s. byte is first on the wire).
ipv6addr2 | 1 | 0x58d4 | 31:0 | x* | ipv6 address 2, bytes 5-8.
ipv6addr2 | 2 | 0x58d8 | 31:0 | x* | ipv6 address 2, bytes 9-12.
ipv6addr2 | 3 | 0x58dc | 31:0 | x* | ipv6 address 2, bytes 13-16.
ipv6addr3 | 0 | 0x58e0 | 31:0 | x* | ipv6 address 3, bytes 1-4 (l.s. byte is first on the wire).
ipv6addr3 | 1 | 0x58e4 | 31:0 | x* | ipv6 address 3, bytes 5-8.
ipv6addr3 | 2 | 0x58e8 | 31:0 | x* | ipv6 address 3, bytes 9-12.
ipv6addr3 | 3 | 0x58ec | 31:0 | x* | ipv6 address 3, bytes 13-16.
en_ipv4_filter = 1:
dword # | address | bits 31:0
0 | 0x58b0 | ipv6addr0
1 | 0x58b4 | ipv6addr0
2 | 0x58b8 | ipv6addr0
3 | 0x58bc | ipv6addr0
4 | 0x58c0 | ipv6addr1
5 | 0x58c4 | ipv6addr1
6 | 0x58c8 | ipv6addr1
7 | 0x58cc | ipv6addr1
8 | 0x58d0 | ipv6addr2
9 | 0x58d4 | ipv6addr2
10 | 0x58d8 | ipv6addr2
11 | 0x58dc | ipv6addr2
12 | 0x58e0 | ipv4addr0
13 | 0x58e4 | ipv4addr1
14 | 0x58e8 | ipv4addr2
15 | 0x58ec | ipv4addr3
field definitions for 1 setting:
field | dword # | address | bit(s) | initial value [1] | description
ipv6addr0 | 0 | 0x58b0 | 31:0 | x | ipv6 address 0, bytes 1-4 (l.s. byte is first on the wire).
ipv6addr0 | 1 | 0x58b4 | 31:0 | x | ipv6 address 0, bytes 5-8.
ipv6addr0 | 2 | 0x58b8 | 31:0 | x | ipv6 address 0, bytes 9-12.
ipv6addr0 | 3 | 0x58bc | 31:0 | x | ipv6 address 0, bytes 13-16.
ipv6addr1 | 0 | 0x58c0 | 31:0 | x | ipv6 address 1, bytes 1-4 (l.s. byte is first on the wire).
ipv6addr1 | 1 | 0x58c4 | 31:0 | x | ipv6 address 1, bytes 5-8.
ipv6addr1 | 2 | 0x58c8 | 31:0 | x | ipv6 address 1, bytes 9-12.
ipv6addr1 | 3 | 0x58cc | 31:0 | x | ipv6 address 1, bytes 13-16.
ipv6addr2 | 0 | 0x58d0 | 31:0 | x | ipv6 address 2, bytes 1-4 (l.s. byte is first on the wire).
ipv6addr2 | 1 | 0x58d4 | 31:0 | x | ipv6 address 2, bytes 5-8.
ipv6addr2 | 2 | 0x58d8 | 31:0 | x | ipv6 address 2, bytes 9-12.
ipv6addr2 | 3 | 0x58dc | 31:0 | x | ipv6 address 2, bytes 13-16.
ipv4addr0 | 0 | 0x58e0 | 31:0 | x | ipv4 address 0 (l.s. byte is first on the wire).
ipv4addr1 | 1 | 0x58e4 | 31:0 | x | ipv4 address 1 (l.s. byte is first on the wire).
ipv4addr2 | 2 | 0x58e8 | 31:0 | x | ipv4 address 2 (l.s. byte is first on the wire).
ipv4addr3 | 3 | 0x58ec | 31:0 | x | ipv4 address 3 (l.s. byte is first on the wire).
[1] the initial values for these registers can be loaded from the eeprom after power-up reset. the registers are written by the mc and are not accessible to the host for writing.
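the address arithmetic and byte placement in the two mipaf layouts can be sketched as follows. the helper names are hypothetical. the sketch assumes that "l.s. byte is first on the wire" means the first address byte on the wire lands in bits 7:0 of the dword; that reading is an interpretation of the tables above, not a confirmed statement of hardware behavior. each helper returns (offset, value) pairs rather than touching a device.

```python
import ipaddress

MIPAF_BASE = 0x58B0       # dword 0 of the mipaf register block
MIPAF_IPV4_BASE = 0x58E0  # dwords 12..15 hold ipv4addr0..3 when en_ipv4_filter = 1


def mipaf_ipv6_writes(index, address, en_ipv4_filter=False):
    """Return four (offset, value) pairs that store one ipv6 address."""
    slots = 3 if en_ipv4_filter else 4  # ipv6addr3 exists only when the bit is 0
    if not 0 <= index < slots:
        raise ValueError("ipv6 address index out of range for this setting")
    raw = ipaddress.IPv6Address(address).packed  # 16 bytes in wire order
    base = MIPAF_BASE + 16 * index
    # assumption: first wire byte of each 4-byte group goes to bits 7:0
    return [(base + 4 * i, int.from_bytes(raw[4 * i:4 * i + 4], "little"))
            for i in range(4)]


def mipaf_ipv4_write(index, address):
    """Return the (offset, value) pair that stores one ipv4 address."""
    if not 0 <= index < 4:
        raise ValueError("ipv4 address index must be in the range 0..3")
    raw = ipaddress.IPv4Address(address).packed  # 4 bytes in wire order
    return (MIPAF_IPV4_BASE + 4 * index, int.from_bytes(raw, "little"))


ipv4_slot1 = mipaf_ipv4_write(1, "192.168.0.1")
ipv6_slot3 = mipaf_ipv6_writes(3, "::1")
```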
initial value:
field | bit(s) | initial value [1] | description
ip_addr 4 bytes | 31:0 | x | 4 bytes of ip (v6 or v4) address: i mod 4 = 0 to bytes 1-4; i mod 4 = 1 to bytes 5-8; i mod 4 = 2 to bytes 9-12; i mod 4 = 3 to bytes 13-16, where i div 4 is the index of the ip address (0...3).
[1] the initial values for these registers can be loaded from the eeprom after power-up reset. the registers are written by the mc and are not accessible to the host for writing.
reset - the registers are cleared on internal_power_on_reset only.
note: these registers should be written in network order.

8.21.10 manageability mac address low - mmal (0x5910 + 8*n [n=0...3]; rw)
where "n" is the exact unicast/multicast address entry, equal to 0, 1, ..., 3.
these registers contain the lower bits of the 48-bit ethernet address. the mmal registers are written by the mc and are not accessible to the host for writing. the registers are used to filter manageability packets. see section 10.4.
reset - the mmal registers are cleared on internal_power_on_reset only. the initial values for this register can be loaded from the eeprom after power-up reset.
note: the mmal.mmal field should be written in network order.
field | bit(s) | initial value [1] | description
mmal | 31:0 | x | manageability mac address low. the lower 32 bits of the 48-bit ethernet address.
[1] the initial values for these registers can be loaded from the eeprom after power-up reset. the registers are written by the mc and are not accessible to the host for writing.

8.21.11 manageability mac address high - mmah (0x5914 + 8*n [n=0...3]; rw)
where "n" is the exact unicast/multicast address entry, equal to 0, 1, ..., 3.
field | bit(s) | initial value [1] | description
mmah | 15:0 | x | manageability mac address high. the upper 16 bits of the 48-bit ethernet address.
reserved | 31:16 | 0x0 | reserved. reads as 0. ignored on write.
[1] the initial values for these registers can be loaded from the eeprom after power-up reset. the registers are written by the mc and are not accessible to the host for writing.
these registers contain the upper bits of the 48-bit ethernet address. the complete address is {mmah, mmal}. the mmah registers are written by the mc and are not accessible to the host for writing. the registers are used to filter manageability packets. see section 10.4.
reset - the mmah registers are cleared on internal_power_on_reset only. the initial values for this register can be loaded from the eeprom after power-up reset or firmware reset.
note: the mmah.mmah field should be written in network order.

8.21.12 flexible tco filter table registers - ftft (0x09400 - 0x097fc; rw)
each of the 4 flexible tco filter table registers (ftft) contains a 128-byte pattern and a corresponding 128-bit mask array. if enabled, the first 128 bytes of the received packet are compared against the non-masked bytes in the ftft register.
each 128-byte filter is composed of 32 dw entries, where each 2 dws are accompanied by an 8-bit mask, one bit per filter byte. the bytes in each 2 dws are written in network order, i.e. byte0 is written to bits [7:0], byte1 to bits [15:8], etc. the mask field is set so that bit 0 in the mask masks byte 0, bit 1 masks byte 1, etc. a value of 1 in the mask field means that the appropriate byte in the filter should be compared to the appropriate byte in the incoming packet.
the last dw of each filter contains a length field defining the number of bytes from the beginning of the packet compared by this filter. if the actual packet length is less than the length specified by this field, the filter fails. otherwise, the outcome depends on the result of the actual byte comparison. the length field should not be greater than 128 and not smaller than 8.
note: the mask field and length field should be 8-byte aligned. the packet length examined by the filter includes the 4 bytes of the crc, even if the crc is stripped.
31:0 | 31:8 | 7:0 | 31:0 | 31:0
reserved | reserved | mask [7:0] | dw 1 | dw 0
reserved | reserved | mask [15:8] | dw 3 | dw 2
reserved | reserved | mask [23:16] | dw 5 | dw 4
reserved | reserved | mask [31:24] | dw 7 | dw 6
...
reserved | reserved | mask [119:112] | dw 29 | dw 28
length | reserved | mask [127:120] | dw 31 | dw 30
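the layout above can be turned into a list of register writes, as sketched below. the function ftft_writes is hypothetical, and two points are assumptions rather than statements from the text: that filter n starts at 0x09400 + 0x100*n (four filters spread evenly over 0x09400 - 0x097fc), and that a length that is not a multiple of 8 is rejected outright. register offsets within a filter follow the ftft field definitions (dw pair at +0x0 and +0x4, mask at +0x8, length at +0xfc). the field table lists length as bits 6:0 while the text allows a value of 128; the sketch passes the length through unchanged.

```python
FTFT_BASE = 0x09400
FILTER_STRIDE = 0x100  # assumption: 4 filters evenly spaced in 0x09400-0x097fc


def ftft_writes(filter_index, pattern, compare):
    """Return (offset, value) pairs for one flexible tco filter.

    pattern is the byte pattern; compare[i] is True when byte i of the
    pattern should be compared against the incoming packet.
    """
    if not 0 <= filter_index <= 3:
        raise ValueError("filter index must be in the range 0..3")
    length = len(pattern)
    if length % 8 or not 8 <= length <= 128:
        raise ValueError("length must be a multiple of 8 between 8 and 128")
    if len(compare) != length:
        raise ValueError("compare needs one entry per pattern byte")
    base = FTFT_BASE + FILTER_STRIDE * filter_index
    data = bytes(pattern).ljust(128, b"\x00")         # unused bytes are zero
    flags = list(compare) + [False] * (128 - length)  # and never compared
    writes = []
    for row in range(16):  # each row: two dwords plus their 8-bit mask
        chunk = data[8 * row:8 * row + 8]
        mask = sum(1 << i for i in range(8) if flags[8 * row + i])
        offset = base + 16 * row
        # byte0 goes to bits [7:0], byte1 to bits [15:8], and so on
        writes.append((offset, int.from_bytes(chunk[:4], "little")))
        writes.append((offset + 4, int.from_bytes(chunk[4:], "little")))
        writes.append((offset + 8, mask))
    writes.append((base + 0xFC, length))  # length field in the last dword
    return writes


writes = ftft_writes(1, bytes(range(8)), [True] * 8)
```

returning the writes as data keeps the sketch independent of any particular register access mechanism.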
field definitions for filter table registers:
field | dword | address | bit(s) | initial value
filter 0 dw0 | 0 | 0x09400 | 31:0 | x
filter 0 dw1 | 1 | 0x09404 | 31:0 | x
filter 0 mask[7:0] | 2 | 0x09408 | 7:0 | x
reserved | 3 | 0x0940c | | x
filter 0 dw2 | 4 | 0x09410 | 31:0 | x
...
filter 0 dw30 | 60 | 0x094f0 | 31:0 | x
filter 0 dw31 | 61 | 0x094f4 | 31:0 | x
filter 0 mask[127:120] | 62 | 0x094f8 | 7:0 | x
length | 63 | 0x094fc | 6:0 | x
the initial values for the ftft registers can be loaded from the eeprom after power-up reset. the ftft registers are written by the mc and are not accessible to the host for writing. the registers are used to filter manageability packets as described in section 10.4.
reset - the ftft registers are cleared on internal_power_on_reset only.

8.22 macsec register descriptions
all the fields in the macsec registers reflecting parts of the packet header are represented in host ordering.

8.22.1 macsec tx capabilities register - lsectxcap (0xb000; ro)
field | bit(s) | initial value | description
nca | 2:0 | 1b | tx ca-supported. number of cas supported by the device.
nsc | 6:3 | 1b | tx sc capable. number of scs supported by the device on the transmit data path. the 82576 supports twice as many sas as the tx sc for seamless re-keying (2 sas).
reserved | 15:7 | 0x0 | reserved.
lsectxsum | 23:16 | 0x0 | tx lsec key sum. a bit-wise xor of the lsectxkey 0 bits and lsectxkey 1 bits. this register can be used by kay (the programming entity) to validate key programming.
reserved | 31:24 | 0x0 | reserved.
8.22.2 macsec rx capabilities register - lsecrxcap (0xb300; ro)
field | bit(s) | initial value | description
nca | 2:0 | 1b | rx ca-supported. number of cas supported by the device.
nsc | 6:3 | 0x1 | rx sc capable. number of scs supported by the device on the receive data path. the 82576 supports twice as many sas as the rx sc for seamless re-keying (2 sas).
reserved | 15:7 | 0x0 | reserved.
rxlkm | 23:16 | 0x0 | rx lsec key sum. a bit-wise xor of the rx macsec keys 0...1 as defined in registers lsecrxkey [n, m]. each byte is xored with the respective byte of the other keys. this register can be used by kay (the programming entity) to validate key programming.
reserved | 31:24 | 0x0 | reserved.

8.22.3 macsec tx control register - lsectxctrl (0xb004; rw)
field | bit(s) | initial value | description
lstxen | 1:0 | 00b | enable tx macsec. enables tx macsec offloading. 00b = disable tx macsec (tx all packets w/o macsec offload). 01b = add integrity signature. 10b = encrypt and add integrity signature. 11b = reserved. when this field equals 00b (macsec offload is disabled), the "tx untagged packet" register is not incremented for transmitted packets.
pnid | 2 | 0b | pn increase disabled. 0b = normal operation. 1b = pn is not incremented - used only for testability mode.
reserved | 4:3 | 00b | reserved.
aisci | 5 | 1b | always include sci. this field controls whether sci is explicitly included in the transmitted sectag. 0b = false; 1b = true.
reserved | 7:6 | 00b | reserved.
pntrh | 31:8 | 11..1b | pn exhaustion threshold. the 24 msbits of the threshold above which hardware interrupts kay to warn of tx sa pn exhaustion and to trigger a new sa renegotiation. bits 7:0 of the threshold are all 1b's. note: unlike the lsectxpn0/1 registers, this field is stored in host ordering.
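the lsectxctrl fields, including the way the 32-bit pn exhaustion threshold is formed from pntrh, can be decoded as in the sketch below. the function and dictionary names are hypothetical; the bit positions and the rule that bits 7:0 of the threshold are all ones come from the table above.

```python
LSTXEN_MODES = {
    0b00: "disabled",
    0b01: "add integrity signature",
    0b10: "encrypt and add integrity signature",
    0b11: "reserved",
}


def decode_lsectxctrl(value):
    """Split a 32-bit lsectxctrl value into its documented fields."""
    pntrh = (value >> 8) & 0xFFFFFF  # 24 msbits of the threshold
    return {
        "lstxen": LSTXEN_MODES[value & 0b11],
        "pnid": bool((value >> 2) & 1),
        "aisci": bool((value >> 5) & 1),
        # bits 7:0 of the threshold are all 1b's
        "pn_threshold": (pntrh << 8) | 0xFF,
    }


# value built from the initial values in the table: pntrh all ones, aisci = 1b
defaults = decode_lsectxctrl(0xFFFF_FF20)
```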
8.22.4 MACsec Rx Control Register - LSECRXCTRL (0xB304; RW)

Field | Bit(s) | Initial Value | Description
Reserved | 1:0 | 00b | Reserved.
LSRXEN | 3:2 | 00b | Enable Rx MACsec. Controls the level of MACsec packet filtering. 00b = Disable Rx MACsec (passes all packets to the host without MACsec processing and without stripping the MACsec header). 01b = Check (execute MACsec offload and post the frame to the host and MC even when it fails the MACsec operation, unless it failed ICV and the C bit was set). 10b = Strict (execute MACsec offload and post the frame to the host and MC only if it does not fail the MACsec operation). 11b = Rx MACsec drop (drops all packets that include a MACsec header).
Reserved | 5:4 | 11b | Reserved.
PLSH | 6 | 0b | Post MACsec header. When set, the 82576 posts the MACsec header and signature (ICV) to host memory. During normal operation this bit should be cleared. Note: When this bit is set, VLAN offload and other filter capabilities are disabled.
RP | 7 | 1b | Replay protect. Enables replay protection.
Reserved | 31:8 | 0x0 | Reserved.

8.22.5 MACsec Tx SCI Low - LSECTXSCL (0xB008; RW)

Field | Bit(s) | Initial Value | Description
SecYL | 31:0 | 0b | MAC address SecY low. The 4 MS bytes of the MAC address copied to the SCI field in the MACsec header. This register is stored in network ordering.

8.22.6 MACsec Tx SCI High - LSECTXSCH (0xB00C; RW)

Field | Bit(s) | Initial Value | Description
SecYH | 15:0 | 0b | MAC address SecY high. The 2 LS bytes of the MAC address copied to the SCI field in the MACsec header. This register is stored in network ordering.
PI | 31:16 | 0b | Port identifier. Always zero for transmitted packets.
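The LSECRXCTRL mode encoding from Section 8.22.4 can be decoded as sketched below. The names are hypothetical, and `set_mode` assumes a read-modify-write style that leaves all other bits, including the reserved bits 5:4 that reset to 11b, untouched.

```python
RX_MODES = {0b00: "disabled", 0b01: "check", 0b10: "strict", 0b11: "drop"}


def decode_lsecrxctrl(value: int) -> dict:
    """Extract LSRXEN (bits 3:2), PLSH (bit 6) and RP (bit 7)."""
    return {
        "mode": RX_MODES[(value >> 2) & 0x3],
        "post_header": bool((value >> 6) & 1),
        "replay_protect": bool((value >> 7) & 1),
    }


def set_mode(value: int, lsrxen: int) -> int:
    """Return the register value with only LSRXEN replaced."""
    if lsrxen not in RX_MODES:
        raise ValueError("LSRXEN is a 2-bit field")
    return (value & ~(0x3 << 2) & 0xFFFFFFFF) | (lsrxen << 2)
```

For example, starting from the reset value 0xB0 (reserved bits 5:4 set, RP set), `set_mode(0xB0, 0b10)` selects strict mode while keeping replay protection enabled.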
8.22.7 MACsec Tx SA - LSECTXSA (0xB010; RW)

Field | Bit(s) | Initial Value | Description
AN0 | 1:0 | 0b | AN0 - Association Number 0. This 2-bit field is posted to the AN field in the transmitted MACsec header when SA 0 is active.
AN1 | 3:2 | 0b | AN1 - Association Number 1. This 2-bit field is posted to the AN field in the transmitted MACsec header when SA 1 is active.
SelSA | 4 | 0b | SA select (SelSA). This bit selects between SA 0 and SA 1 smoothly (on a packet boundary). A value of 0 selects SA 0 and a value of 1 selects SA 1.
ActSA (RO) | 5 | 0b | Active SA (ActSA). This bit indicates the active SA. ActSA follows the value of SelSA on a packet boundary. The KaY (the programming entity) can use this indication to retire the old SA.
Reserved | 31:6 | 0x0 | Reserved.

8.22.8 MACsec Tx SA PN 0 - LSECTXPN0 (0xB018; RW)

As described in Section 8.1.2.1, the PN registers (LSECTXPN0, LSECTXPN1, and LSECRXSAPN) are stored in network ordering.

Field | Bit(s) | Initial Value | Description
PN | 31:0 | 0x0 | PN - Packet Number. This field is posted to the PN field in the transmitted MACsec header when SA 0 is active. It is initialized by the KaY at SA creation and then increments by 1 for each transmitted packet using this SA. Packets should never be transmitted if the PN repeats itself. To protect against such an event, the hardware generates an LSECPN interrupt to the KaY when the PN reaches the exhaustion threshold defined in the LSECTXCTRL register. There is an additional level of defense against repeating the PN: the hardware never transmits packets after the PN reaches a value of 0xFF...FF. To guarantee this, the hardware clears the Enable Tx MACsec field in the LSECTXCTRL register if the PN is greater than or equal to 0xFF...EF.
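The SA switch-over described for LSECTXSA (program SelSA, then wait for ActSA to follow on a packet boundary) can be sketched as two small helpers. The helper names are hypothetical; the bit positions come from the LSECTXSA table.

```python
def encode_lsectxsa(an0: int, an1: int, selsa: int) -> int:
    """Pack AN0 (bits 1:0), AN1 (bits 3:2) and SelSA (bit 4)."""
    if not (0 <= an0 <= 3 and 0 <= an1 <= 3):
        raise ValueError("association numbers are 2-bit values")
    if selsa not in (0, 1):
        raise ValueError("SelSA is a single bit")
    return an0 | (an1 << 2) | (selsa << 4)


def switch_complete(reg_value: int) -> bool:
    """True once the read-only ActSA bit (5) matches SelSA (4)."""
    selsa = (reg_value >> 4) & 1
    actsa = (reg_value >> 5) & 1
    return selsa == actsa
```

A KaY would typically write the new value, poll the register until `switch_complete` returns true, and only then retire the old SA.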
8.22.9 MACsec Tx SA PN 1 - LSECTXPN1 (0xB01C; RW)

Field | Bit(s) | Initial Value | Description
PN | 31:0 | 0x0 | PN - Packet Number. This field is posted to the PN field in the transmitted MACsec header when SA 1 is active. It is initialized by the KaY at SA creation and then increments by 1 for each transmitted packet using this SA. Packets should never be transmitted if the PN repeats itself. To protect against such an event, the hardware generates an LSECPN interrupt to the KaY when the PN reaches the exhaustion threshold defined in the LSECTXCTRL register. There is an additional level of defense against repeating the PN: the hardware never transmits packets after the PN reaches a value of 0xFF...FF. To guarantee this, the hardware clears the Enable Tx MACsec field in the LSECTXCTRL register if the PN is greater than or equal to 0xFF...EF.

8.22.10 MACsec Tx Key 0 - LSECTXKEY0 (0xB020 + 4*n [n=0...3]; WO)

Field | Bit(s) | Initial Value | Description
LSECK0 | 31:0 | 0x0 | LSEC key 0. Transmit MACsec key of SA 0. n=0: LSEC key defines bits 31:0 of the Tx MACsec key. n=1: LSEC key defines bits 63:32 of the Tx MACsec key. n=2: LSEC key defines bits 95:64 of the Tx MACsec key. n=3: LSEC key defines bits 127:96 of the Tx MACsec key. This field is WO for confidentiality protection. For a data integrity check, the KaY can read the hash value from the LSECTXSUM field in the LSECTXCAP register. If for some reason a read request is aimed at this register, a value of all zeros is returned.

8.22.11 MACsec Tx Key 1 - LSECTXKEY1 (0xB030 + 4*n [n=0...3]; WO)

Field | Bit(s) | Initial Value | Description
LSECK1 | 31:0 | 0x0 | LSEC key 1. Transmit MACsec key of SA 1. n=0: LSEC key defines bits 31:0 of the Tx MACsec key. n=1: LSEC key defines bits 63:32 of the Tx MACsec key. n=2: LSEC key defines bits 95:64 of the Tx MACsec key. n=3: LSEC key defines bits 127:96 of the Tx MACsec key. This field is WO for confidentiality protection. For a data integrity check, the KaY can read the hash value from the LSECTXSUM field in the LSECTXCAP register. If for some reason a read request is aimed at this register, a value of all zeros is returned.
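The mapping of a 128-bit Tx key onto the four key registers (n=0 holds bits 31:0, n=3 holds bits 127:96) can be sketched as follows. The function name is hypothetical, and the key is treated as a plain 128-bit integer; how the bytes of a key string map onto that integer is not stated here and is left as an assumption for the caller.

```python
LSECTXKEY0_BASE = 0xB020  # Tx key registers for SA 0
LSECTXKEY1_BASE = 0xB030  # Tx key registers for SA 1


def tx_key_writes(key: int, sa: int) -> list:
    """Return (offset, 32-bit word) pairs for LSECTXKEY0/1, n = 0...3."""
    if sa not in (0, 1):
        raise ValueError("the Tx path has two SAs: 0 and 1")
    if not 0 <= key < (1 << 128):
        raise ValueError("key must fit in 128 bits")
    base = LSECTXKEY0_BASE if sa == 0 else LSECTXKEY1_BASE
    # Register n carries key bits (32*n + 31):(32*n).
    return [(base + 4 * n, (key >> (32 * n)) & 0xFFFFFFFF) for n in range(4)]
```

Because the registers are write-only, a driver cannot read the words back; it has to rely on the key sum field for validation.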
8.22.12 MACsec Rx SCI Low - LSECRXSCL (0xB3D0; RW)

Field | Bit(s) | Initial Value | Description
MAL | 31:0 | 0x0 | MAC address SecY low. The 4 MS bytes of the MAC address in the SCI field of the incoming packet that are compared with this field for SCI matching. The comparison result is meaningful only if the SC bit in the TCI header is set. This register is stored in network ordering.

8.22.13 MACsec Rx SCI High - LSECRXSCH (0xB3E0; RW)

Field | Bit(s) | Initial Value | Description
MAH | 15:0 | 0x0 | MAC address SecY high. The 2 LS bytes of the MAC address in the SCI field of the incoming packet that are compared with this field for SCI matching. The comparison result is meaningful only if the SC bit in the TCI header is set. This register is stored in network ordering.
PI | 31:16 | 0x0 | Port identifier. The port number in the SCI field of the incoming packet that is compared with this field for SCI matching. The comparison result is meaningful only if the SC bit in the TCI header is set.

8.22.14 MACsec Rx SA - LSECRXSA[n] (0xB310 + 4*n [n=0...1]; RW)

The following registers relate to the MACsec receive SA context. There are two SAs in the receive data path, defined as SA0 and SA1. The index n of the registers below is the SA index.

Field | Bit(s) | Initial Value | Description
AN | 1:0 | 00b | AN - Association Number. This field is compared with the AN field in the TCI field of the incoming packet for a match.
SAV | 2 | 0b | SA valid. This bit is set or cleared by the KaY to validate or invalidate the SA.
FRR (RO) | 3 | 0b | Frame received. This bit is cleared when SA Valid (bit 2) transitions from 0 to 1, and is set when a frame is received with this SA. When the Frame Received bit is set, the Retired bit of the other SA of the same SC is set. Note that a single frame reception with the new SA is sufficient to retire the old SA, since the replay window is assumed to be 0.
Retired (RO) | 4 | 0b | Retired. When this bit is set, the SA is invalid (retired). This bit is cleared when a new SA is configured by the KaY (SA Valid transitions to 1). It is set to 1 when a packet is received with the other SA of the same SC. Note that a single frame reception with the new SA is sufficient to retire the old SA, since the replay window is assumed to be 0.
Reserved | 31:5 | 0x0 | Reserved.

8.22.15 MACsec Rx SA PN - LSECRXSAPN (0xB330 + 4*n [n=0...1]; RW)

Field | Bit(s) | Initial Value | Description
PN | 31:0 | 0x0 | PN - Packet Number. This register holds the PN field of the next incoming packet that uses this SA. The PN field in the incoming packet must be greater than or equal to the PN register. The PN register is set by the KaY at SA creation. It is updated by the hardware for each received packet using this SA to be the received PN + 1. These registers are stored in network ordering.

8.22.16 MACsec Rx Key - LSECRXKEY (0xB350 + 16*n [n=0...1] + 4*m [m=0...3]; WO)

Field | Bit(s) | Initial Value | Description
LSECK | 31:0 | 0x0 | LSEC key. Receive MACsec key of SA n, where n=0,1. m=0: LSEC key defines bits 31:0 of the Rx MACsec key. m=1: LSEC key defines bits 63:32 of the Rx MACsec key. m=2: LSEC key defines bits 95:64 of the Rx MACsec key. m=3: LSEC key defines bits 127:96 of the Rx MACsec key. This field is WO for confidentiality protection. For a data integrity check, the KaY can read the hash value from the Rx key sum field (RXLKM) in the LSECRXCAP register. If for some reason a read request is aimed at this register, a value of all zeros is returned.
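The replay rule stated for LSECRXSAPN (an incoming PN must be greater than or equal to the register, which then becomes the received PN + 1) can be modeled in software as below. This is a behavioral sketch only: the class name is hypothetical, network byte ordering and 32-bit wrap are ignored, and it assumes that the register advances only for packets that pass the check.

```python
class RxSaPnModel:
    """Software model of one LSECRXSAPN register with a replay window of 0."""

    def __init__(self, initial_pn: int):
        # Set by the KaY at SA creation.
        self.next_pn = initial_pn

    def accept(self, packet_pn: int) -> bool:
        """Return True if the packet passes replay protection."""
        if packet_pn < self.next_pn:
            return False  # replayed or late packet
        # ASSUMPTION: only accepted packets advance the expected PN.
        self.next_pn = packet_pn + 1
        return True
```

What happens to a packet that fails the check (forwarded or dropped) depends on the LSRXEN mode and the RP bit in LSECRXCTRL and is not modeled here.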
8.22.17 MACsec Software/Firmware Interface - LSWFW (0x8F14; RO)

Note: The access rules on this register are for the driver software.

Field | Bit(s) | Initial Value | Description
Lock MACsec Logic | 0 | 0b | Lock MACsec. 0b = Host can access MACsec registers. 1b = Host cannot access MACsec registers.
Block Host Traffic | 1 | 0b | When set, all host traffic (Tx and Rx) is blocked.
Request MACsec (SC) | 2 | 0b | When set, a message is sent to the BMC requesting access to the MACsec registers.
Release MACsec (SC) | 3 | 0b | When set, a message is sent to the BMC releasing ownership of the MACsec registers.
Reserved | 7:4 | 0x0 | Reserved.
Operating System Status | 8 | 0b | Set by the firmware to indicate the status of the MACsec ownership: 0b = MACsec owned by host (default). 1b = MACsec owned by BMC.
Reserved | 31:9 | 0x0 | Reserved.

8.22.18 MACsec Tx Port Statistics

These counters are defined by the specification as 64 bits wide, while the hardware implements only 32 bits. The KaY must implement the 64-bit counters in software by regularly polling the hardware statistic counters. The hardware portion of each statistics counter is cleared when it is read.

8.22.18.1 Tx Untagged Packet Counter - LSECTXUT (0x4300; RC)

Field | Bit(s) | Initial Value | Description
UPC | 31:0 | 0x0 | Untagged packet cnt. Increments for each packet that is transmitted with the ILSec bit cleared in the packet descriptor while the Enable Tx MACsec field in the LSECTXCTRL register is either 01b or 10b. The KaY must implement a 64-bit counter. It can do so by reading the LSECTXUT register regularly.

8.22.18.2 Encrypted Tx Packets Count - LSECTXPKTE (0x4304; RC)

Field | Bit(s) | Initial Value | Description
EPC | 31:0 | 0x0 | Encrypted packet cnt. Increments for each packet transmitted through the controlled port with the E bit set (confidentiality was prescribed for this packet by software/firmware).
8.22.18.3 Protected Tx Packets Count - LSECTXPKTP (0x4308; RC)

Field | Bit(s) | Initial Value | Description
PPC | 31:0 | 0x0 | Protected packet cnt. Increments for each packet transmitted through the controlled port with the E bit cleared (integrity only was prescribed for this packet by software/firmware).

8.22.18.4 Encrypted Tx Octets Count - LSECTXOCTE (0x430C; RC)

Field | Bit(s) | Initial Value | Description
EOC | 31:0 | 0x0 | Encrypted octet cnt. Increments for each byte of user data through the controlled port with the E bit set (confidentiality was prescribed for this packet by software/firmware).

8.22.18.5 Protected Tx Octets Count - LSECTXOCTP (0x4310; RC)

Field | Bit(s) | Initial Value | Description
POC | 31:0 | 0x0 | Protected octet cnt. Increments for each byte of user data through the controlled port with the E bit cleared (integrity only was prescribed for this packet by software/firmware).

8.22.19 MACsec Rx Port Statistics

These counters are defined by the specification as 64 bits wide, while the hardware implements only 32 bits. The KaY must implement the 64-bit counters in software by regularly polling the hardware statistic counters.

8.22.19.1 MACsec Untagged Rx Packet Count - LSECRXUT (0x4314; RC)

Field | Bit(s) | Initial Value | Description
UPC | 31:0 | 0x0 | Untagged packet cnt. Increments for each packet received having no tag. Increments only when the Enable Rx MACsec field in the LSECRXCTRL register is either 01b or 10b.
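Because the Tx port counters are read-to-clear, a software 64-bit counter only needs to add each value it reads to a running total. The sketch below shows one way to do that; the class name is hypothetical and register access is abstracted behind a caller-supplied function, which is an assumption of this example. The polling interval has to be short enough that the 32-bit hardware counter cannot wrap between two reads.

```python
class Counter64:
    """Accumulate a 32-bit read-to-clear hardware counter into 64 bits."""

    MASK64 = (1 << 64) - 1

    def __init__(self, read_register):
        # read_register: hypothetical callable returning the current 32-bit
        # counter value; the hardware clears the counter on each read.
        self._read = read_register
        self.total = 0

    def poll(self) -> int:
        """Read the hardware counter once and fold it into the total."""
        delta = self._read() & 0xFFFFFFFF
        self.total = (self.total + delta) & self.MASK64
        return self.total
```

One `Counter64` instance per statistics register (for example LSECTXUT or LSECTXPKTE) keeps the bookkeeping independent for each counter.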
8.22.19.2 MACsec Rx Octets Decrypted Count - LSECRXOCTE (0x431C; RC)

Field | Bit(s) | Initial Value | Description
DROC | 31:0 | 0x0 | Decrypted Rx octet cnt. The number of octets of user data recovered from received frames that were both integrity protected and encrypted. This includes the octets from the SecTAG to the ICV, not inclusive. These counts are incremented even if the recovered user data failed the integrity check or could not be recovered.

8.22.19.3 MACsec Rx Octets Validated Count - LSECRXOCTP (0x4320; RC)

Field | Bit(s) | Initial Value | Description
RC | 31:0 | 0x0 | Validated Rx octet cnt. The number of octets of user data recovered from received frames that were integrity protected but not encrypted. This includes the octets from the SecTAG to the ICV, not inclusive. These counts are incremented even if the recovered user data failed the integrity check or could not be recovered.

8.22.19.4 MACsec Rx Packet with Bad Tag Count - LSECRXBAD (0x4324; RC)

Field | Bit(s) | Initial Value | Description
BRPC | 31:0 | 0x0 | Bad Rx packet cnt. Number of packets received with an invalid tag.

8.22.19.5 MACsec Rx Packet No SCI Count - LSECRXNOSCI (0x4328; RC)

Field | Bit(s) | Initial Value | Description
USRPC | 31:0 | 0x0 | No SCI Rx packet cnt. Number of packets received with an unrecognized SCI and dropped due to that condition.
8.22.19.6 MACsec Rx Packet Unknown SCI Count - LSECRXUNSCI (0x432C; RC)

Field | Bit(s) | Initial Value | Description
USRPC | 31:0 | 0b | Unknown SCI Rx packet cnt. Number of packets received with an unrecognized SCI but still forwarded to the host.

8.22.20 MACsec Rx SC Statistic Register Descriptions

8.22.20.1 MACsec Rx Unchecked Packets Count - LSECRXUNCH (0x4330; RC)

Software/firmware needs to maintain the full-sized register.

Field | Bit(s) | Initial Value | Description
URPC | 31:0 | 0x0 | Unchecked Rx packet cnt. Number of packets received with MACsec encapsulation (SecTAG) while validateFrames is disabled (LSECRXCTRL bits 3:2 equal 00b).

8.22.20.2 MACsec Rx Delayed Packets Count - LSECRXDELAY (0x4340; RC)

Software/firmware needs to maintain the full-sized register.

Field | Bit(s) | Initial Value | Description
DRPC | 31:0 | 0x0 | Delayed Rx packet cnt. Number of packets received and accepted for validation that failed replay protection while replayProtect is false (LSECRXCTRL.RP, bit 7, is 0b).

8.22.20.3 MACsec Rx Late Packets Count - LSECRXLATE (0x4350; RC)

Software/firmware needs to maintain the full-sized register.

Field | Bit(s) | Initial Value | Description
LRPC | 31:0 | 0x0 | Late Rx packet cnt. Number of packets received and accepted for validation that failed replay protection while replayProtect is true (LSECRXCTRL.RP, bit 7, is 1b). In strict mode, these packets are dropped.
8.22.21 MACsec Rx SA Statistic Register Descriptions

8.22.21.1 MACsec Rx Packet OK Count - LSECRXOK[n] (0x4360 + 4*n [n=0...1]; RC)

Field | Bit(s) | Initial Value | Description
ORPC | 31:0 | 0x0 | OK Rx packet cnt. Number of packets received that were valid (authenticated) and passed replay protection.

8.22.21.2 MACsec Rx Invalid Count - LSECRXINV[n] (0x4380 + 4*n [n=0...1]; RC)

Field | Bit(s) | Initial Value | Description
ICRPC | 31:0 | 0x0 | Invalid Rx packet cnt. Number of packets received that were not valid (authentication failed) and were forwarded to the host.

8.22.21.3 MACsec Rx Not Valid Count - LSECRXNV[n] (0x43A0 + 4*n [n=0...1]; RC)

Field | Bit(s) | Initial Value | Description
ICRPC | 31:0 | 0b | Invalid Rx packet cnt. Number of packets received that were not valid (authentication failed) and were dropped.

8.22.21.4 MACsec Rx Not Using SA Count - LSECRXNUSA (0x43C0; RC)

Field | Bit(s) | Initial Value | Description
ISSRPC | 31:0 | 0b | Invalid SA Rx packet cnt. Number of packets received that were associated with an SA that is not "inUse" (no match on AN, not valid, or retired) and were dropped.

8.22.21.5 MACsec Rx Unused SA Count - LSECRXUNSA (0x43D0; RC)

Field | Bit(s) | Initial Value | Description
ISSRPC | 31:0 | 0b | Invalid SA Rx packet cnt. Number of packets received that were associated with an SA that is not "inUse" (no match on AN, not valid, or retired) and were forwarded to the host.
8.23 IPsec Registers Description

IPsec registers are owned by the PF in IOV mode. Unlike the MACsec case, there is no added value in encrypting the SA contents when they are read by software, because the SA contents are available in clear text from system memory anyway, as for any IPsec flow handled in software.

8.23.1 IPsec Control - IPSCTRL (0xB430; RW)

Field | Bit(s) | Initial Value | Description
TX_IPSEC_EN (RW) | 0 | 0b | IPsec Tx offload enable bit. When set, the IPsec offload ability is enabled for the Tx path. When cleared, the IPsec offload ability is disabled for the Tx path, regardless of the contents of the Tx SA table.
RX_IPSEC_EN (RW) | 1 | 0b | IPsec Rx offload enable bit. When set, the IPsec offload ability is enabled for the Rx path. When cleared, the IPsec offload ability is disabled for the Rx path, regardless of the contents of the Rx SA tables.
DELETE_ALL (RW/SC) | 2 | 0b | Delete all bit. When set, the hardware invalidates all the Rx table entries. This bit can be set by software only if the IPSRXCMD.BUSY bit was previously read as cleared. The Delete All bit is cleared when the hardware finishes deleting all the entries.
IPSEC_FRAG | 3 | 0b | Avoid IPsec offload on IP fragments. 0 = No IPsec offload is done on IP fragments (correct operating mode). 1 = IPsec offload is done even on IP fragments.
Reserved | 31:4 | 0x0 | Reserved.

8.23.2 IPsec Tx Index - IPSTXIDX (0xB450; RW)

Field | Bit(s) | Initial Value | Description
SA_IDX | 7:0 | 0x0 | SA index for indirect access into the Tx SA table.
Reserved | 31:8 | 0x0 | Reserved.

8.23.3 IPsec Tx Key Registers - IPSTXKEY (0xB460 + 4*n [n = 0...3]; RW)

Defines the key value used as part of the Tx SA. See Section 7.9.2.5.1 for details.

Field | Bit(s) | Initial Value | Description
AES-128 Key | 31:0 | 0x0 | 4 bytes of the 16-byte key that has been read/written from/into the Tx SA entry pointed to by SA_IDX. n=0 contains the LSB of the key. n=3 contains the MSB of the key. Any write to this register must be followed by a write to the IPSTXSALT register, because that write triggers the internal write of the whole entry into the Tx SA table.

8.23.4 IPsec Tx Salt Register - IPSTXSALT (0xB454; RW)

Defines the salt value used as part of the Tx SA. See Section 7.9.2.5.1 for details.

Field | Bit(s) | Initial Value | Description
AES-128 Salt | 31:0 | 0x0 | 4-byte salt that has been read/written from/into the Tx SA entry pointed to by SA_IDX. Writing this register triggers the internal write of the whole entry into the Tx SA table, so it should be written last whenever a Tx SA entry is updated.

8.23.5 IPsec Rx Command Register - IPSRXCMD (0xB408; RW)

Field | Bit(s) | Initial Value | Description
Reserved | 1:0 | 00b | Reserved.
PROTO | 2 | 0b | IPsec protocol select. When set, this SA offloads ESP packets. When cleared, this SA offloads AH packets.
DECRYPT | 3 | 0b | When set, hardware performs decryption offload on this SA. Meaningful only if PROTO is set (ESP mode).
IPV6 | 4 | 0b | IPv6 type. When set, this SA expects to receive an IPv6 packet. When cleared, hardware forces the IPv6 address bits IPSRXIPADDR 0...2 to 0 and the lookup is based only on the high-order 32 bits (IPSRXIPADDR3). This bit is the LS bit in the search key. When software adds an entry it must write all of IPSRXIPADDR 0...3 in any case (IPv6 or IPv4). For an IPv4 address, IPSRXIPADDR 0...2 must be set to 0.
Reserved | 7:5 | 000b | Reserved.
USED_SA (RO) | 16:8 | 0x0 | Number of used SAs, 0...256. If equal to 0, delete commands are ignored. If equal to 256, add commands are ignored.
Reserved | 29:17 | 0x0 | Reserved.
ADD_DEL | 30 | 0b | Add or delete command. When set, hardware adds the SA information to the Rx SA table. When cleared, hardware deletes the SA that matches the IPSRXSPI register, the IPSRXIPADDR 0...3 registers, and the IPv6 type bit.
BUSY (RO/SC) | 31 | 0b | Busy bit. Used by hardware to lock software access to the SA table while the hardware is adding or deleting an entry. The Busy bit is automatically set by hardware when software writes to this register. The Busy bit is cleared when hardware finishes adding or deleting an SA.

Notes: If software reads the Busy bit as set, any IPsec register write has no effect, and software should continue polling the IPSRXCMD.BUSY bit (read access) until it is read as cleared. Software should not make changes in the Tx SA table while changing the IPSEC_EN bit.

8.23.6 IPsec Rx SPI Register - IPSRXSPI (0xB40C; RW)

Field | Bit(s) | Initial Value | Description
SPI | 31:0 | 0x0 | SPI field that has been deleted/added from/into the Rx SA table. Note: The field is defined in big endian (MS byte is first on the wire).

8.23.7 IPsec Rx Key Register - IPSRXKEY (0xB410 + 4*n [n = 0...3]; RW)

Field | Bit(s) | Initial Value | Description
AES-128 Key | 31:0 | 0x0 | 4 bytes of the 16-byte key that has been deleted/added from/into the Rx SA table. n=0 contains the LSB of the key. n=3 contains the MSB of the key.

8.23.8 IPsec Rx Salt Register - IPSRXSALT (0xB404; RW)

Field | Bit(s) | Initial Value | Description
AES-128 Salt | 31:0 | 0x0 | 4-byte salt that has been deleted/added from/into the Rx SA table.
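The write-ordering rules for the IPsec SA registers can be summarized as ordered lists of register writes. The sketch below builds those lists without touching hardware. The helper names are hypothetical; the Tx ordering (IPSTXSALT last) is stated in the tables, while the Rx ordering assumes that IPSRXCMD is written last because writing it sets the Busy bit and starts the add. Software is still expected to read IPSRXCMD.BUSY as cleared before issuing any of these writes, and byte ordering of the SPI and address values is not handled here.

```python
# Register offsets taken from the section headings.
IPSTXIDX, IPSTXSALT, IPSTXKEY = 0xB450, 0xB454, 0xB460
IPSRXSALT, IPSRXCMD, IPSRXSPI = 0xB404, 0xB408, 0xB40C
IPSRXKEY, IPSRXIPADDR = 0xB410, 0xB420

# IPSRXCMD bits from the field table.
PROTO_ESP = 1 << 2
DECRYPT = 1 << 3
ADD = 1 << 30


def tx_sa_writes(sa_idx, key_words, salt):
    """Ordered (offset, value) writes for one Tx SA entry."""
    if len(key_words) != 4:
        raise ValueError("an AES-128 key is four 32-bit words")
    writes = [(IPSTXIDX, sa_idx & 0xFF)]
    writes += [(IPSTXKEY + 4 * n, w & 0xFFFFFFFF)
               for n, w in enumerate(key_words)]
    # The salt write triggers the internal table update, so it goes last.
    writes.append((IPSTXSALT, salt & 0xFFFFFFFF))
    return writes


def rx_sa_add_ipv4_writes(spi, ipv4_addr, key_words, salt):
    """Ordered writes that add an ESP decrypt SA for an IPv4 destination."""
    if len(key_words) != 4:
        raise ValueError("an AES-128 key is four 32-bit words")
    # For IPv4, IPSRXIPADDR 0...2 are written with zeros.
    writes = [(IPSRXIPADDR + 4 * n, 0) for n in range(3)]
    writes.append((IPSRXIPADDR + 12, ipv4_addr & 0xFFFFFFFF))
    writes.append((IPSRXSPI, spi & 0xFFFFFFFF))
    writes += [(IPSRXKEY + 4 * n, w & 0xFFFFFFFF)
               for n, w in enumerate(key_words)]
    writes.append((IPSRXSALT, salt & 0xFFFFFFFF))
    # ASSUMPTION: the command write comes last; the IPV6 bit stays cleared.
    writes.append((IPSRXCMD, ADD | PROTO_ESP | DECRYPT))
    return writes
```

A driver would pass each list to its own MMIO write routine in order, polling the Busy bit again after the final Rx command write.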
8.23.9 IPsec Rx IP Address Register - IPSRXIPADDR (0xB420 + 4*n [n = 0...3]; RW)

Field | Bit(s) | Initial Value | Description
IPADDR | 31:0 | 0x0 | 4 bytes of the 16-byte destination IP address for the associated Rx SA(s) that has been deleted/added from/into the Rx SA table. n=0 contains the MSB of an IPv6 address. n=3 contains the LSB of an IPv6 address or an IPv4 address. For an IPv4 address, IPSRXIPADDR 0...2 shall be written by software with zeros. Note: The field is defined in big endian (LS byte is first on the wire).

8.23.10 IPsec Rx Index - IPSRXIDX (0xB400; RW)

Field | Bit(s) | Initial Value | Description
SA_IDX | 7:0 | 0x0 | SA index for indirect access into the Rx SA table.
Reserved | 30:8 | 0x0 | Reserved.
DBG_MOD | 31 | 0b | Debugging read access mode. When set, the Rx SA entries can be read by setting the SA_IDX field to the index of the SA entry to be read. When cleared, the normal Rx SA table access mode is used. This bit must be set only after the Busy bit of the IPSRXCMD register has been read as cleared.

8.24 Diagnostic Registers Description

The 82576 contains several diagnostic registers. These registers enable software to directly access the contents of the 82576's internal packet buffer memory (PBM), also referred to as FIFO space. These registers also give software visibility into which locations in the PBM the hardware currently considers to be the head and tail for both transmit and receive operations.

8.24.1 Receive Data FIFO Head Register - RDFH (0x02410; RWS)

This register stores the head of the on-chip receive data FIFO. Since the internal FIFO is organized in units of 64-bit words, this field contains the 64-bit offset of the current receive FIFO head. A value of 0x8 in this register corresponds to an offset of 8 quadwords into the receive FIFO space. This register is available for diagnostic purposes only and should not be written during normal operation.

Field | Bit(s) | Initial Value | Description
FIFO 0 Head | 12:0 | 0b | Receive FIFO 0 head pointer. This field refers to the whole Rx packet buffer. Note: The field is in units of 64-bit lines.
Reserved | 30:13 | 0b | Reads as 0b. Should be written to 0b for future compatibility.
FIFO 0 Full | 31 | 0b | Rx memory full signal. This bit rises when there are fewer than 4 empty rows in the Rx packet buffer. This bit indicates the status of the whole Rx packet buffer.

8.24.2 Receive Data FIFO Tail Register - RDFT (0x02418; RWS)

This register stores the tail of the on-chip receive data FIFO. Since the internal FIFO is organized in units of 64-bit words, this field contains the 64-bit offset of the current receive FIFO tail. A value of 0x8 in this register corresponds to an offset of eight quadwords (64 bytes) into the receive FIFO space. This register is available for diagnostic purposes only and should not be written during normal operation.

Field | Bit(s) | Initial Value | Description
FIFO Tail 0 | 12:0 | 0b | Receive FIFO buffer tail pointer. This field refers to the whole Rx packet buffer.
Reserved | 31:13 | 0b | Reads as 0b. Should be written to 0b for future compatibility.

8.24.3 Receive Data FIFO Head Saved Register - RDFHS (0x02420; RWS)

This register stores a copy of the receive data FIFO head register in case the internal register needs to be restored. This register is available for diagnostic purposes only and should not be written during normal operation.

Field | Bit(s) | Initial Value | Description
FIFO Head 0 | 12:0 | 0b | A saved value of the receive FIFO head pointer. This field refers to the whole Rx packet buffer. Note: The field is in units of 64-bit lines.
Reserved | 31:15 | 0b | Reads as 0b. Should be written to 0b for future compatibility.

8.24.4 Receive Data FIFO Tail Saved Register - RDFTS (0x02428; RWS)

This register stores a copy of the receive data FIFO tail register in case the internal register needs to be restored. This register is available for diagnostic purposes only and should not be written during normal operation.

Field | Bit(s) | Initial Value | Description
FIFO Tail 0 | 12:0 | 0b | Rx read desc pointer. This field refers to the whole Rx packet buffer. Note: The field is in units of 64-bit lines.
Reserved | 31:15 | 0b | Reads as 0b. Should be written to 0b for future compatibility.
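The FIFO pointer registers count in 64-bit lines, so converting a pointer to a byte offset is a multiplication by 8. The sketch below decodes an RDFH value accordingly; the function name is hypothetical.

```python
QWORD_BYTES = 8  # the FIFO is organized in units of 64-bit words


def decode_rdfh(value: int) -> dict:
    """Decode RDFH: head pointer in bits 12:0, full flag in bit 31."""
    head_qwords = value & 0x1FFF
    return {
        "head_qwords": head_qwords,
        "head_bytes": head_qwords * QWORD_BYTES,
        "full": bool((value >> 31) & 1),
    }
```

A register value of 0x8 therefore decodes to a head pointer 8 quadwords, or 64 bytes, into the receive FIFO space.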
8.24.5 Switch Buffer FIFO Head Register - SWBFH (0x03010; RWS)

This register stores the head pointer of the on-chip switch data FIFO. Since the internal FIFO is organized in units of 64-bit words, this field contains the 64-bit offset of the current switch FIFO head. A value of 0x8 in this register corresponds to an offset of 8 quadwords, or 64 bytes, into the switch FIFO space. This register is available for diagnostic purposes only and should not be written during normal operation.

Field | Bit(s) | Initial Value | Description
FIFO 0 Head | 12:0 | 0b | Switch FIFO 0 head pointer. This field refers to the whole switch packet buffer. Note: The field is in units of 64-bit lines.
Reserved | 30:13 | 0b | Should be written to 0b for future compatibility.
FIFO 0 Full | 31 | 0b | Switch memory full signal. This bit rises when there are fewer than 4 empty rows in the Rx packet buffer. This bit indicates the status of the whole switch packet buffer.

8.24.6 Switch Buffer FIFO Tail Register - SWBFT (0x03018; RWS)

This register stores the tail pointer of the on-chip switch data FIFO. Since the internal FIFO is organized in units of 64-bit words, this field contains the 64-bit offset of the current switch FIFO tail. A value of 0x8 in this register corresponds to an offset of 8 quadwords, or 64 bytes, into the switch FIFO space. This register is available for diagnostic purposes only and should not be written during normal operation.

Field | Bit(s) | Initial Value | Description
FIFO Tail 0 | 12:0 | 0b | Switch FIFO tail pointer. This field refers to the whole switch packet buffer.
Reserved | 31:13 | 0b | Reads as 0b. Should be written to 0b for future compatibility.

8.24.7 Switch Buffers FIFO Head Saved Register - SWBFHS (0x03020; RWS)

Field | Bit(s) | Initial Value | Description
FIFO Head 0 | 12:0 | 0b | A saved value of the switch FIFO head pointer. This field refers to the whole switch packet buffer. Note: The field is in units of 64-bit lines.
Reserved | 31:13 | 0b | Reads as 0b. Should be written to 0b for future compatibility.
programming interface ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 635 8.24.8 switch buffers fifo tail saved register - swbfts (0x03028; rws) 8.24.9 packet buffer diagnostic - pbdiag (0x02458; r/w) 8.24.10 transmit data fifo head register - tdfh (0x03410; rws) this register stores the head of the on?chip transmit data fifo. since the internal fifo is organized in units of 64-bit words, this field contains the 64-bit offset of the current transmit fifo head. a value of 0x8 in this register corresponds to an offset of 8 quadwords into the transmit fifo space. this register is available for diagnostic purposes only, and should not be written during normal operation. field bit(s) initial value description fifo tail 0 12:0 0b switch read desc pointer0. this field refers to the whole switch packet buffer. note: the field is in units of 64-bit lines. reserved 31:13 0b reads as 0b. should be written to 0b for future compatibility. field bit(s) initial value description bypass mode 0 0b descriptor monitor bypass. when this bit is set to 1b the descriptor monitor (checking if there is enough descriptors in q ring) is disabled. pb_mng 1 0b packet buffer for manageability. when set to 1b all rx traffic is not written to the packet buffer so that the packet buffer could be used as memory for manageability controller code. reserved 19:2 0x0 reserved. dbu_empty (ro) 20 x all fifos (rx and tx) are empty. cfg_rx_wait 21 0b stop reading data from the receive data buffer to the dma rx machine. diagnostic only. cfg_tx_wait 22 0b stop reading data from the transmit data buffer towards the tx mac. diagnostic only reserved 28:23 00b reserved. always set to 00b. stat_sel 31:29 0x0 select the statistics reflected in dbgc_1 to dbgc_3 field bit(s) initial value description
intel ? 82576eb gbe controller ? programming interface intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 636 8.24.11 transmit data fifo tail register - tdft (0x03418; rws) this register stores the tail of the on?chip transmit data fifo. since the internal fifo is organized in units of 64-bit words, this field contains the 64-bit offset of the current transmit fifo tail. a value of 0x8 in this register corresponds to an offset of 8 quadwords into the transmit fifo space. this register is available for diagnostic purposes only, and should not be written during normal operation. 8.24.12 transmit data fifo head saved register - tdfhs (0x03420; rws) this register stores a copy of the transmit data fifo head register in case the internal register needs to be restored. this register points to the beginning of the last packet in the packet buffer, even if it was already transmitted. this register is available for diagnostic purposes only, and should not be written during normal operation. 8.24.13 transmit data fifo tail saved register - tdfts (0x03428; rws) this register stores a copy of the transmit data fifo tail register in case the internal register needs to be restored. this register is available for diagnostic purposes only, and should not be written during normal operation. fifo head 0 12:0 0b transmit fifo head pointer. this field refers to the whole tx packet buffer. note: the field is in units of 64-bit lines. reserved 30:13 0x0 reads as 0b. should be written to 0b for future compatibility. tx memory full 0 31 0b tx fifo memory full indication. this bit rises when there are less than 4 empty rows in the tx packet buffer. this bit indicates the status of the whole tx packet buffer. field bit(s) initial value description fifo tail 0 12:0 0x0 transmit fifo tail pointer. this field refers to the whole tx packet buffer. note: the field is in units of 64-bit lines. reserved 30:13 0x0 reads as 0b. should be written to 0b for future compatibility. 
tdfhs field definitions (section 8.24.12):

field | bit(s) | initial value | description
fifo head 0 | 12:0 | 0x0 | transmit fifo last packet header pointer. this field refers to the whole tx packet buffer. note: the field is in units of 64-bit lines.
reserved | 31:13 | 000b | reads as 000b. should be written to 0b for future compatibility.
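The fifo pointer registers above share one layout: a pointer in bits 12:0 counted in 64-bit lines, plus (for tdfh only) a memory-full flag in bit 31. The following is a minimal sketch of decoding a raw 32-bit value that some diagnostic tool has already read; the function name and the returned dictionary keys are illustrative assumptions, not part of the datasheet.

```python
FIFO_PTR_MASK = 0x1FFF      # bits 12:0, pointer in units of 64-bit lines
TX_MEMORY_FULL = 1 << 31    # bit 31, defined for tdfh only
LINE_BYTES = 8              # one 64-bit line (quadword) is 8 bytes


def decode_fifo_pointer(raw):
    """Split a raw tdfh/tdft/tdfhs/tdfts value into its documented fields.

    Hypothetical helper: 'raw' is assumed to be a 32-bit register value
    obtained elsewhere (this sketch performs no hardware access).
    """
    lines = raw & FIFO_PTR_MASK
    return {
        "lines": lines,                      # offset in quadwords
        "byte_offset": lines * LINE_BYTES,   # same offset in bytes
        "tx_memory_full": bool(raw & TX_MEMORY_FULL),
    }


# The datasheet's own example: 0x8 means 8 quadwords into the tx fifo.
print(decode_fifo_pointer(0x8))
```

Because these registers are diagnostic only, a tool built on this sketch should read them and never write them during normal operation.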
tdfts field definitions (section 8.24.13):

field | bit(s) | initial value | description
fifo tail 0 | 12:0 | 0x0 | tx read desc pointer. this field refers to the whole rx packet buffer. note: the field is in units of 64-bit lines.
reserved | 31:13 | 000b | reads as 000b. should be written to 0b for future compatibility.

8.24.14 transmit data fifo packet count - tdfpc (0x03430; ro)

this register reflects the number of packets to be transmitted that are currently in the transmit fifo. this register is available for diagnostic purposes only, and should not be written during normal operation.

field | bit(s) | initial value | description
tx fifo packet count 0 | 11:0 | 0x0 | the number of packets to be transmitted that are currently in the tx fifo. this reflects the number of packets in the full packet buffer.
reserved | 31:13 | 0000b | reads as 0000b.

8.24.15 receive data fifo packet count - rdfpc (0x02430; ro)

this register reflects the number of packets to be received that are currently in the receive fifo. this register is available for diagnostic purposes only, and should not be written during normal operation.

field | bit(s) | initial value | description
rx fifo packet count 0 | 11:0 | 0x0 | the number of packets to be received that are currently in the rx fifo. this reflects the number of packets in the full packet buffer.
reserved | 31:12 | 0000b | reads as 0000b.

8.24.16 switch data fifo packet count - swdfpc (0x03030; ro)

this register reflects the number of packets to be received that are currently in the switch fifo. this register is available for diagnostic purposes only, and should not be written during normal operation.

field | bit(s) | initial value | description
software fifo packet count 0 | 11:0 | 0x0 | the number of packets to be received that are currently in the switch fifo. this reflects the number of packets in the full packet buffer.
reserved | 31:12 | 0000b | reads as 0000b.
8.24.17 ipsec packet buffer ecc status - ippbeccsts (0xb470; rc)

field | bit(s) | initial value | description
corr_err_cnt | 7:0 | 0x0 | correctable error count. this counter is incremented every time a correctable error is detected; the counter stops after reaching 0xff. these bits are cleared by reads.
uncorr_err_cnt | 15:8 | 0x0 | uncorrectable error count. this counter is incremented every time an uncorrectable error is detected; the counter stops after reaching 0xff. these bits are cleared by reads.
ecc enable (rw) | 16 | 1b | ecc enable for packet buffer.
reserved | 25:17 | 0b | reserved.
pb_cor_err_sta | 26 | 0b | status of pb correctable error. this bit is cleared by a read.
pb_uncor_err_sta | 27 | 0b | status of pb uncorrectable error. this bit is cleared by a read.
reserved | 31:28 | 0b | reserved.

8.24.18 pb slave access control - pbslac (0x3100; rw)

all pbm (fifo) data is available to diagnostics. locations are accessed as 128-bit words using the pbslac and pbslad registers.

field | bit(s) | initial value | description
reserved | 3:0 | 0x0 | reserved. must be written with 0.
addr | 15:4 | 0x0 | address. software sets the address in which to access the chosen memory. aligned for 16 bytes.
reserved | 18:16 | 0x0 | reserved.
mem_sel | 20:19 | 0x0 | memory select. 00 - rx pb; 01 - tx pb; 10 - switch pb; 11 - reserved.
reserved | 29:21 | 0x0 | reserved.
rd_req | 30 | 0x0 | read request. the software sets this bit when asking to read data. hardware clears this bit when data is ready. reading can be done during traffic.
wr_req | 31 | 0x0 | write request. the software sets this bit when asking to write data. hardware clears this bit when data is ready. writing should not be done during traffic.
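As an illustration of the pbslac layout above, the following sketch composes a request word and checks whether hardware has cleared the request bits. It assumes that, because bits 3:0 are reserved and the address is 16-byte aligned, the low 16 bits of the register carry the byte address directly; that reading, along with the function names, is an assumption rather than something the datasheet states.

```python
MEM_SEL = {"rx": 0b00, "tx": 0b01, "switch": 0b10}  # mem_sel, bits 20:19
RD_REQ = 1 << 30   # read request, cleared by hardware when data is ready
WR_REQ = 1 << 31   # write request, not to be used during traffic


def pbslac_command(memory, byte_addr, write=False):
    """Build a pbslac value (hypothetical helper, no hardware access).

    Assumes byte_addr maps onto bits 15:0 with bits 3:0 forced to zero.
    """
    if byte_addr % 16 != 0:
        raise ValueError("address must be 16-byte aligned")
    if not 0 <= byte_addr <= 0xFFF0:
        raise ValueError("address does not fit in addr bits 15:4")
    request = WR_REQ if write else RD_REQ
    return request | (MEM_SEL[memory] << 19) | byte_addr


def request_done(pbslac_value):
    """True once both request bits read back as zero."""
    return (pbslac_value & (RD_REQ | WR_REQ)) == 0


print(hex(pbslac_command("tx", 0x120)))  # 0x40080120
```

A diagnostic tool would write the composed word to pbslac, poll until `request_done` holds for the value read back, and then transfer the 128-bit line through the four pbslad dwords.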
8.24.19 pb slave access data - pbslad (0x3110 + 4*n [n= 0...3]; rw)

field | bit(s) | initial value | description
data | 31:0 | 0x0 | data. packet buffer rd/wr data. n = 0: bits [31:0]; n = 1: bits [63:32]; n = 2: bits [95:64]; n = 3: bits [127:96].

8.24.20 rx descriptor handler memory - rdhm (0x06000 + 4*n [n= 0..1023]; ro)

all of the rx descriptor handler cache is available to diagnostics. locations can be accessed as 32-bit or 64-bit words. the descriptor handler cache is 4kb in size. the cache is divided into 4 queues of 1k each. the access to the memory is linear. for example, descriptor n (31:0) in rx queue x (15:0) is located at address 0x06000 + x * 0x200 + n * 0x10. the packet buffer is accessible by pages of 4k. the accessed page is set in the rdhmp register.

field | bit(s) | initial value | description
fifo data | 31:0 | x | rx descriptor handler data.

8.24.21 rx descriptor handler memory page number - rdhmp (0x025fc; rw)

note: the queue depth field must be updated before the receive queues are enabled (before writing to any csr that controls the queues).

field | bit(s) | initial value | description
page | 3:0 | 0x0 | rx descriptor handler accessed page (4kb). valid values are 0 or 1.
reserved | 29:4 | 0x0 | reserved.
queue depth | 31:30 | 00b | defines the number of descriptors (per queue) in the cache. 00b = 32 descriptors; 01b = 16 descriptors; 10b = 8 descriptors; 11b = 4 descriptors.
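The address formula and page register above can be combined as in the sketch below. It assumes the default queue depth of 32 descriptors and interprets the 4kb paging as "page = linear offset divided by 4kb, window address = 0x06000 plus the remainder"; the datasheet gives only the linear formula, so the paging split and the helper names are assumptions.

```python
RDHM_BASE = 0x06000
PAGE_SIZE = 0x1000            # the window is accessed in 4kb pages
QUEUE_STRIDE = 0x200          # x * 0x200 in the datasheet formula
DESC_STRIDE = 0x10            # n * 0x10 in the datasheet formula
QUEUE_DEPTH = {0b00: 32, 0b01: 16, 0b10: 8, 0b11: 4}


def rdhm_locate(queue, desc):
    """Return (page, window_address) for descriptor 'desc' of 'queue'.

    Hypothetical helper; assumes 32 descriptors per queue and that the
    page value is what would be programmed into rdhmp bits 3:0.
    """
    if not 0 <= queue <= 15:
        raise ValueError("queue must be in 0..15")
    if not 0 <= desc <= 31:
        raise ValueError("descriptor must be in 0..31")
    linear = queue * QUEUE_STRIDE + desc * DESC_STRIDE
    page, offset = divmod(linear, PAGE_SIZE)
    return page, RDHM_BASE + offset


def decode_rdhmp(raw):
    """Return (page, descriptors_per_queue) from a raw rdhmp value."""
    return raw & 0xF, QUEUE_DEPTH[(raw >> 30) & 0x3]


print(rdhm_locate(2, 5))          # (0, 25680), i.e. page 0, address 0x6450
print(decode_rdhmp(0x80000001))   # (1, 8)
```

Under these assumptions queues 0 to 7 fall in page 0 and queues 8 to 15 in page 1, which is consistent with the table's statement that valid page values are 0 or 1.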
8.24.22 tx descriptor handler memory - tdhm (0x07000 + 4*n [n= 0..1023]; ro)

all of the tx descriptor handler cache is available to diagnostics. locations can be accessed as 32-bit or 64-bit words. the descriptor handler cache is 4kb in size. the cache is divided into 4 queues of 1k each. the access to the memory is linear. for example, descriptor n (31:0) in tx queue x (15:0) is located at address 0x07000 + x * 0x200 + n * 0x10. the packet buffer is accessible by pages of 4k. the accessed page is set in the tdhmp register.

field | bit(s) | initial value | description
fifo data | 31:0 | x | tx descriptor handler data.

8.24.23 tx descriptor handler memory page number - tdhmp (0x035fc; r/w)

note: the queue depth field must be updated before the receive queues are enabled (before writing to any csr that controls the queues).

the pbtcwbcol, pbtccomp, pbtcwbd, pbtcwbrs, and pbtcwbtail bits enable internal descriptor handler operations to be performed in parallel with the software tail bumping process, thus avoiding a performance impact when the tail is bumped frequently.

field | bit(s) | initial value | description
page | 3:0 | 0x0 | tx descriptor handler accessed page (4kb). valid values are 0 or 1.
reserved | 29:4 | 0x0 | reserved.
queue depth | 31:30 | 00b | defines the number of descriptors (per queue) in the cache. 00b = 32 descriptors; 01b = 16 descriptors; 10b = 8 descriptors; 11b = 4 descriptors.
8.24.24 rx packet buffer ecc status - rpbeccsts (0x0245c; rc)

note: header replication, header split, or packet replication may cause the respective counters to count more than once.

field | bit(s) | initial value | description
corr_err_cnt | 7:0 | 0x0 | correctable error count. this counter is incremented every time a correctable error is detected; the counter stops after reaching 0xff. these bits are cleared by reads.
uncorr_err_cnt | 15:8 | 0x0 | uncorrectable error count. this counter is incremented every time an uncorrectable error is detected; the counter stops after reaching 0xff. these bits are cleared by reads. note that this counter might count the same error more than once if the data is read multiple times.
ecc enable (rw) | 16 | 1b | ecc enable for packet buffer.
reserved | 25:17 | 0x0 | reserved.
pb_cor_err_sta | 26 | 0b | status of pb correctable error. this bit is cleared by a read.
pb_uncor_err_sta | 27 | 0b | status of pb uncorrectable error. this bit is cleared by a read.
reserved | 31:28 | 0x0 | reserved.

8.24.25 tx packet buffer ecc status - tpbeccsts (0x0345c; rc)

field | bit(s) | initial value | description
corr_err_cnt | 7:0 | 0x0 | correctable error count. this counter is incremented every time a correctable error is detected; the counter stops after reaching 0xff. these bits are cleared by reads.
uncorr_err_cnt | 15:8 | 0x0 | uncorrectable error count. this counter is incremented every time an uncorrectable error is detected; the counter stops after reaching 0xff. these bits are cleared by reads.
ecc enable (rw) | 16 | 1b | ecc enable for packet buffer.
reserved | 25:17 | 0x0 | reserved.
pb_cor_err_sta | 26 | 0b | status of pb correctable error. this bit is cleared by a read.
pb_uncor_err_sta | 27 | 0b | status of pb uncorrectable error. this bit is cleared by a read.
reserved | 31:28 | 0x0 | reserved.

note: uncorrectable errors occurring in the tx packet buffer may propagate to the switch buffer.

8.24.26 switch packet buffer ecc status - swpbeccsts (0x0305c; rc)

field | bit(s) | initial value | description
corr_err_cnt | 7:0 | 0x0 | correctable error count. this counter is incremented every time a correctable error is detected; the counter stops after reaching 0xff. these bits are cleared by reads.
uncorr_err_cnt | 15:8 | 0x0 | uncorrectable error count. this counter is incremented every time an uncorrectable error is detected; the counter stops after reaching 0xff. these bits are cleared by reads. note that this counter might count errors propagated from the tx buffer and might count the same error more than once if the data is read multiple times.
ecc enable (rw) | 16 | 1b | ecc enable for packet buffer.
reserved | 25:17 | 0x0 | reserved.
pb_cor_err_sta | 26 | 0b | status of pb correctable error. this bit is cleared by a read.
pb_uncor_err_sta | 27 | 0b | status of pb uncorrectable error. this bit is cleared by a read.
reserved | 31:28 | 0x0 | reserved.

8.24.27 ipsec packet buffer ecc error inject - ippbeei (0xb474; rw)

field | bit(s) | initial value | description
write injection enable | 0 | 0b | write injection enable. when this bit is set, an error is injected the next time a line is written to the packet buffer. this bit is auto cleared by hardware when an error is injected.
reset data | 1 | 0b | reset data. clears all bits in the data line on which the error is inserted.
reserved | 15:2 | 0x0 | reserved.
error1 bit location | 23:16 | 0xff | no error injection on this bit. maximum allowed value is 74 for error injection.
error2 bit location | 31:24 | 0xff | no error injection on this bit. maximum allowed value is 74 for error injection.

8.24.28 rx descriptor handler ecc status - rdhests (0x025c0; rc)

note: a single error may cause the relevant counter to increase more than once.

field | bit(s) | initial value | description
corr_err_cnt | 7:0 | 0x0 | correctable error count. this counter is incremented every time a correctable error is detected in the rx descriptor handler memory; the counter stops after reaching 0xff. these bits are cleared by reads.
uncorr_err_cnt | 15:8 | 0x0 | uncorrectable error count. this counter is incremented every time an uncorrectable error is detected in the rx descriptor handler memory; the counter stops after reaching 0xff. these bits are cleared by reads.
rdhecc enable (rw) | 16 | 1b | rx descriptor handler ecc enable.
reserved | 31:17 | 0x0 | reserved.

8.24.29 tx descriptor handler ecc status - tdhests (0x35c0; rc)

note: a single error may cause the relevant counter to increase more than once.

field | bit(s) | initial value | description
corr_err_cnt | 7:0 | 0x0 | correctable error count. this counter is incremented every time a correctable error is detected in the tx descriptor handler memory; the counter stops after reaching 0xff. these bits are cleared by reads.
uncorr_err_cnt | 15:8 | 0x0 | uncorrectable error count. this counter is incremented every time an uncorrectable error is detected in the tx descriptor handler memory; the counter stops after reaching 0xff. these bits are cleared by reads.
tdhecc enable (rw) | 16 | 1b | tx descriptor handler ecc enable.
reserved | 31:17 | 0x0 | reserved.
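The ecc status registers in this section share a common low half: a correctable count in bits 7:0, an uncorrectable count in bits 15:8, and an enable bit at bit 16. The sketch below decodes one raw value and flags saturation at 0xff; the helper name and dictionary keys are illustrative assumptions. Since the counters are cleared by reads, a tool would decode the single value it read rather than reading the register twice.

```python
COUNT_MAX = 0xFF  # counters stop incrementing after reaching 0xff


def decode_ecc_status(raw):
    """Decode the fields common to the ecc status registers above.

    Hypothetical helper operating on a value that was already read once;
    no hardware access is performed here.
    """
    corr = raw & 0xFF
    uncorr = (raw >> 8) & 0xFF
    return {
        "corr_err_cnt": corr,
        "uncorr_err_cnt": uncorr,
        "ecc_enabled": bool(raw & (1 << 16)),
        # A saturated counter means "at least 255", not "exactly 255".
        "corr_saturated": corr == COUNT_MAX,
        "uncorr_saturated": uncorr == COUNT_MAX,
    }


print(decode_ecc_status(0x0001FF02))
```

For registers whose bits 15:8 are reserved, the uncorrectable fields of the result should simply be ignored.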
8.24.30 pcie retry buffer ecc status - prbests (0x05ba0; rc)

this register is shared between the lan ports.

field | bit(s) | initial value | description
corr_err_cnt | 7:0 | 0x0 | correctable error count. this counter is incremented every time a correctable error is detected in the pcie retry buffer memory; the counter stops after reaching 0xff. these bits are cleared by reads.
reserved | 15:8 | 0x0 | reserved.
prbecc enable (rw) | 16 | 1b | pcie retry buffer ecc enable.
reserved | 31:17 | 0x0 | reserved.

8.24.31 pcie write buffer ecc status - pwbests (0x05bb0; rc)

field | bit(s) | initial value | description
corr_err_cnt | 7:0 | 0x0 | correctable error count. this counter is incremented every time a correctable error is detected in the pcie write buffer memory; the counter stops after reaching 0xff. these bits are cleared by reads.
reserved | 15:8 | 0x0 | reserved.
pwbecc enable (rw) | 16 | 1b | pcie write buffer ecc enable.
reserved | 31:17 | 0x0 | reserved.

8.24.32 pcie msi-x ecc status - pmsixests (0x05ba8; rc)

this register is shared between lan ports.

field | bit(s) | initial value | description
corr_err_cnt | 7:0 | 0x0 | correctable error count. this counter is incremented every time a correctable error is detected in the pcie msi-x vector memory; the counter stops after reaching 0xff. these bits are cleared by reads.
reserved | 15:8 | 0x0 | reserved.
pmsixe enable (rw) | 16 | 1b | pcie msi-x memory ecc enable.
reserved | 31:17 | 0x0 | reserved.
8.24.33 parity and ecc error indication - peind (0x1084; rc)

field | bit(s) | initial value | description
dbu_ecc_sw_pb_nfatal | 0 | 0b | non fatal error detected in the data part of a packet in the switch packet buffer memory.
dbu_ecc_tx_pb_nfatal | 1 | 0b | non fatal error detected in the data part of a packet in the tx packet buffer memory.
dbu_ecc_rx_pb_nfatal | 2 | 0b | non fatal error detected in the data part of a packet in the rx packet buffer memory.
reserved | 7:3 | 0x0 | reserved.
dtx_parity_temp_fatal | 8 | 0b | fatal error detected in tso prototype header memory.
mac_parity_secu_rx_sa_fatal | 9 | 0b | fatal error detected in ipsec rx keys memory.
mac_parity_secu_tx_sa_fatal | 10 | 0b | fatal error detected in ipsec tx keys memory.
drx_parity_desc_fifo_fatal | 11 | 0b | fatal error detected in internal received packets descriptor memory.
dtx_parity_dpt2_fatal | 12 | 0b | fatal error detected in internal transmit packets descriptor memory.
dtx_parity_hdr_fatal | 13 | 0b | fatal error detected in internal transmit packets descriptor memory.
dbu_parity_drx_pb_reg_fatal | 14 | 0b | fatal error detected in rx packet buffer registers file.
dtx_parity_dh_reg_fatal | 15 | 0b | fatal error detected in tx descriptor handler registers file.
drx_parity_dh_reg_fatal | 16 | 0b | fatal error detected in rx descriptor handler registers file.
dtx_parity_lso_fatal | 17 | 0b | fatal error detected in tso internal tx context memory.
dtx_parity_cntxt_fatal | 18 | 0b | fatal error detected in internal tx context memory.
xtx_parity_buf_stg_fatal | 19 | 0b | fatal error detected in xtx internal fifo.
mac_parity_secu_post_l4cs_fatal | 20 | 0b | fatal error detected in ipsec checksum rx fifo.
mac_parity_secu_pre_hdr_parse_fatal | 21 | 0b | fatal error detected in ipsec pre-processing rx fifo.
mac_parity_secu_post_sta_fatal | 22 | 0b | fatal error detected in ipsec post processing rx fifo.
ghost_parity_desc_rd_fatal | 23 | 0b | fatal error detected in descriptor completion buffer.
ghost_parity_data_rd_fatal | 24 | 0b | fatal error detected in data completion buffer.
drx_ecc_icache_fatal | 25 | 0b | fatal error detected in rx descriptor handler icache.
dtx_ecc_icache_fatal | 26 | 0b | fatal error detected in tx descriptor handler icache.
dbu_ecc_sw_pb_fatal | 27 | 0b | fatal error detected in the header part of a packet in the switch packet buffer memory.
dbu_ecc_tx_pb_fatal | 28 | 0b | fatal error detected in the header part of a packet in the tx packet buffer memory.
dbu_ecc_rx_pb_fatal | 29 | 0b | fatal error detected in the header part of a packet in the rx packet buffer memory.
mac_ecc_secu_tx_pb_fatal | 30 | 0b | fatal error detected in the tx ipsec packet buffer memory.
mem_fault_port_hang (ro) | 31 | 0b | indicates a hang condition due to a fatal parity or ecc error in one of the memories.

8.24.34 parity and ecc indication mask - peindm (0x1088; rw)

field | bit(s) | initial value | description
dbu_ecc_sw_pb_nfatal | 0 | 0b | enable impact of error detected in the data part of a packet in the switch packet buffer memory.
dbu_ecc_tx_pb_nfatal | 1 | 0b | enable impact of error detected in the data part of a packet in the tx packet buffer memory.
dbu_ecc_rx_pb_nfatal | 2 | 0b | enable impact of error detected in the data part of a packet in the rx packet buffer memory.
reserved | 7:3 | 0x0 | reserved.
dtx_parity_temp_fatal | 8 | 0b | enable impact of error detected in tso prototype header memory.
mac_parity_secu_rx_sa_fatal | 9 | 0b | enable impact of error detected in ipsec rx keys memory.
mac_parity_secu_tx_sa_fatal | 10 | 0b | enable impact of error detected in ipsec tx keys memory.
drx_parity_desc_fifo_fatal | 11 | 0b | enable impact of error detected in internal received packets descriptor memory.
dtx_parity_dpt2_fatal | 12 | 0b | enable impact of error detected in internal transmit packets descriptor memory.
dtx_parity_hdr_fatal | 13 | 0b | enable impact of error detected in internal transmit packets descriptor memory.
dbu_parity_drx_pb_reg_fatal | 14 | 0b | enable impact of error detected in rx packet buffer registers file.
dtx_parity_dh_reg_fatal | 15 | 0b | enable impact of error detected in tx descriptor handler registers file.
drx_parity_dh_reg_fatal | 16 | 0b | enable impact of error detected in rx descriptor handler registers file.
dtx_parity_lso_fatal | 17 | 0b | enable impact of error detected in tso internal tx context memory.
dtx_parity_cntxt_fatal | 18 | 0b | enable impact of error detected in internal tx context memory.
xtx_parity_buf_stg_fatal | 19 | 0b | enable impact of error detected in xtx internal fifo.
mac_parity_secu_post_l4cs_fatal | 20 | 0b | enable impact of error detected in ipsec checksum rx fifo.
mac_parity_secu_pre_hdr_parse_fatal | 21 | 0b | enable impact of error detected in ipsec pre-processing rx fifo.
mac_parity_secu_post_sta_fatal | 22 | 0b | enable impact of error detected in ipsec post processing rx fifo.
ghost_parity_desc_rd_fatal | 23 | 0b | enable impact of error detected in descriptor completion buffer.
ghost_parity_data_rd_fatal | 24 | 0b | enable impact of error detected in data completion buffer.
drx_ecc_icache_fatal | 25 | 0b | enable impact of error detected in rx descriptor handler icache.
dtx_ecc_icache_fatal | 26 | 0b | enable impact of error detected in tx descriptor handler icache.
dbu_ecc_sw_pb_fatal | 27 | 0b | enable impact of error detected in the header part of a packet in the switch packet buffer memory.
dbu_ecc_tx_pb_fatal | 28 | 0b | enable impact of error detected in the header part of a packet in the tx packet buffer memory.
dbu_ecc_rx_pb_fatal | 29 | 0b | enable impact of error detected in the header part of a packet in the rx packet buffer memory.
mac_ecc_secu_tx_pb_fatal | 30 | 0b | enable impact of error detected in the tx ipsec packet buffer memory.
parity enable | 31 | 1b | enable parity in all the parity protected memories.

8.24.35 tx dma performance burst and descriptor count - txbdc (0x35e0; rc)

field | bit(s) | initial value | description
burstcnt | 15:0 | 0x0 | counts the transitions from idle to burst state as long as the internal txcycle counter does not equal 0xffff. the counter is cleared on read.
desccnt | 31:16 | 0x0 | counts the number of descriptors fetched as long as the internal txcycle counter does not equal 0xffff. the counter is cleared on read.

8.24.36 tx dma performance idle count - txidle (0x35e4; rc)

note: reading this register clears the internal txcycle counter. this counter then counts for a period of 64k cycles. when this period is over, the counters in txbdc and in this register freeze. in order to read the results of the measurements, the txbdc register must be read first.
field | bit(s) | initial value | description
singlecnt | 15:0 | 0x0 | counts the transitions from idle to single state as long as the internal txcycle counter does not equal 0xffff. the counter is cleared on read.
idlecnt | 31:16 | 0x0 | counts the cycles on which the txdma is in idle for more than one cycle as long as the internal txcycle counter does not equal 0xffff. the counter is cleared on read.
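The read-order rule for the tx dma performance counters (txbdc first, then txidle, because reading txidle clears the txcycle counter and starts a new 64k-cycle window) can be sketched as follows. The `read_reg` callback is a hypothetical stand-in for whatever register-read primitive a diagnostic tool provides; the fake register values exist only to make the example runnable.

```python
TXBDC = 0x35E0    # burstcnt in bits 15:0, desccnt in bits 31:16
TXIDLE = 0x35E4   # singlecnt in bits 15:0, idlecnt in bits 31:16


def split_halves(raw):
    """Return (bits 15:0, bits 31:16) of a 32-bit value."""
    return raw & 0xFFFF, (raw >> 16) & 0xFFFF


def read_tx_dma_sample(read_reg):
    """Collect one frozen measurement window.

    read_reg(offset) is an assumed callback returning a 32-bit value.
    txbdc is read first; the txidle read then restarts the window.
    """
    burstcnt, desccnt = split_halves(read_reg(TXBDC))
    singlecnt, idlecnt = split_halves(read_reg(TXIDLE))
    return {
        "burstcnt": burstcnt,
        "desccnt": desccnt,
        "singlecnt": singlecnt,
        "idlecnt": idlecnt,
    }


# Stand-in register file used only for this illustration.
fake_regs = {TXBDC: 0x00400010, TXIDLE: 0x12340003}
read_order = []


def fake_read(offset):
    read_order.append(offset)
    return fake_regs[offset]


sample = read_tx_dma_sample(fake_read)
print(sample)
```

The same pattern would apply to the rx counterparts (rxbdc and rxidle), which the datasheet describes with the same read-order requirement.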
8.24.37 rx dma performance burst and descriptor count - rxbdc (0x25e0; rc)

field | bit(s) | initial value | description
burstcnt | 15:0 | 0x0 | counts the transitions from idle to burst state as long as the internal rxcycle counter does not equal 0xffff. the counter is cleared on read.
desccnt | 31:16 | 0x0 | counts the number of descriptors fetched as long as the internal rxcycle counter does not equal 0xffff. the counter is cleared on read.

8.24.38 rx dma performance idle count - rxidle (0x25e4; rc)

note: reading this register clears the internal rxcycle counter. this counter then counts for a period of 64k cycles. when this period is over, the counters in rxbdc and in this register freeze. in order to read the results of the measurements, the rxbdc register must be read first.

field | bit(s) | initial value | description
singlecnt | 15:0 | 0x0 | counts the transitions from idle to single state as long as the internal rxcycle counter does not equal 0xffff. the counter is cleared on read.
idlecnt | 31:16 | 0x0 | counts the cycles on which the txdma is in idle for more than one cycle as long as the internal rxcycle counter does not equal 0xffff. the counter is cleared on read.

8.25 phy software interface (phyreg)

- base registers (0 through 10 and 15) are defined in accordance with the "reconciliation sub layer and media independent interface" and "physical layer link signaling for 10/100/1000 mb/s auto-negotiation" sections of ieee 802.3.
- additional registers (phyreg.16 through 28) are defined in accordance with the ieee 802.3 specification for adding unique chip functions.

note: the phy register bit descriptions are in table 8-26. phyreg 26 is defined as a secure register. this means that software attempts to access this register are blocked by the mac; only internal hardware accesses (for example, firmware) are enabled.

table 8-26. table of phyreg registers

offset | abbreviation | name | rw
00d | pctrl | phy control register | r/w
01d | pstatus | phy status register | r
02d | phy id 1 | phy identifier register 1 (lsb) | r
03d | phy id 2 | phy identifier register 2 (msb) | r
04d | ana | auto-negotiation advertisement register | r/w
05d | - | auto-negotiation base page ability register | r
06d | ane | auto-negotiation expansion register | r
07d | npt | auto-negotiation next page transmit register | r/w
08d | lpn | auto-negotiation next page ability register | r
09d | gcon | 1000base-t/100base-t2 control register | r/w
10d | gstatus | 1000base-t/100base-t2 status register | r
15d | estatus | extended status register | r
16d | pconf | port configuration register | r/w
17d | pstat | port status 1 register | ro
18d | pcont | port control register | ro
19d | link | link health register | ro
20d | pfifo | 1000base-t fifo register | r/w
21d | chan | channel quality register | ro
25d | - | phy power management | r/w
26d | - | special gigabit disable register | r/w
27d | - | misc. control register 1 | r/w
28d | - | misc. control register 2 | ro
31d | - | page select core register | wo
8.25.1 phy control register - pctrl (00d; r/w)

field | bit(s) | description | mode | default
reserved | 5:0 | reserved. always read as 0b. write to 0b for normal operation. | rw | always 000000b
speed selection 1000 mb/s (msb) | 6 | speed selection is determined by bits 6 (msb) and 13 (lsb) as follows: 11b = reserved; 10b = 1000 mb/s; 01b = 100 mb/s; 00b = 10 mb/s. a write to these bits does not take effect until a software reset is asserted, restart auto-negotiation is asserted, or power down transitions from power down to normal operation. note: if auto-negotiation is enabled, this bit is ignored. | r/w | 00b
collision test | 7 | 1b = enable col signal test. 0b = disable col signal test. note: this bit is ignored unless loopback is enabled (bit 14 = 1b). | r/w | 0b
duplex mode | 8 | 1b = full duplex. 0b = half duplex. note: if auto-negotiation is enabled, this bit is ignored. | r/w | 1b
restart auto-negotiation | 9 | 1b = restart auto-negotiation process. 0b = normal operation. auto-negotiation automatically restarts after hardware or software reset regardless of whether or not the restart bit is set. | wo, sc | 0b
isolate | 10 | this bit has no effect on phy functionality. program to 0b for future compatibility. | r/w | 0b
power down | 11 | 1b = power down. 0b = normal operation. when using this bit, phy default configuration is lost and is not loaded from the eeprom after de-asserting the power down bit. note: after this bit is set, all indications from phy including link status are invalid. | r/w | 0b
auto-negotiation enable | 12 | 1b = enable auto-negotiation process. 0b = disable auto-negotiation process. this bit must be enabled for 1000base-t operation. | r/w | 1b
speed selection (lsb) | 13 | see speed selection (msb), bit 6. note: if auto-negotiation is enabled, this bit is ignored. | r/w | 1b
loopback | 14 | 1b = enable loopback. 0b = disable loopback. | r/w | 0b
reset | 15 | 1b = phy reset. 0b = normal operation. note: when using phy reset, the phy default configuration is not loaded from the eeprom. the preferred way to reset the 82576 phy is using the ctrl.phy_rst field. | wo, sc | 0b

8.25.2 phy status register - pstatus (01d; r)

field | bit(s) | description | mode | default
extended capability | 0 | 1b = extended register capabilities. | ro | 1b
jabber detect | 1 | 1b = jabber condition detected. 0b = jabber condition not detected. | ro, lh | 0b
link status | 2 | 1b = link is up. 0b = link is down. | ro, ll | 0b
auto-negotiation ability | 3 | 1b = phy able to perform auto-negotiation. 0b = phy is not able to perform auto-negotiation. | ro | 1b
remote fault | 4 | 1b = remote fault condition detected. 0b = remote fault condition not detected. | ro, lh | 0b
auto-negotiation complete | 5 | 1b = auto-negotiation process complete. 0b = auto-negotiation process not complete. | ro | 0b
mf preamble suppression | 6 | 1b = phy accepts management frames with preamble suppressed. 0b = phy does not accept management frames with preamble suppressed. | ro | 0b
reserved | 7 | reserved. ignore on reads. | ro | 0b
extended status | 8 | 1b = extended status information in the extended phy status register (15d). 0b = no extended status information in the extended phy status register (15d). | ro | 1b
100base-t2 half duplex | 9 | 1b = phy able to perform half duplex 100base-t2 (not supported). 0b = phy not able to perform half duplex 100base-t2. | ro | 0b
100base-t2 full duplex | 10 | 1b = phy able to perform full duplex 100base-t2 (not supported). 0b = phy not able to perform full duplex 100base-t2. | ro | 0b
10 mb/s half duplex | 11 | 1b = phy able to perform half duplex 10base-t. 0b = phy not able to perform half duplex 10base-t. | ro | 1b
10 mb/s full duplex | 12 | 1b = phy able to perform full duplex 10base-t. 0b = phy not able to perform full duplex 10base-t. | ro | 1b
100base-x half duplex | 13 | 1b = phy able to perform half duplex 100base-x. 0b = phy not able to perform half duplex 100base-x. | ro | 1b
100base-x full duplex | 14 | 1b = phy able to perform full duplex 100base-x. 0b = phy not able to perform full duplex 100base-x. | ro | 1b
100base-t4 | 15 | 1b = phy able to perform 100base-t4. 0b = phy not able to perform 100base-t4. | ro | 0b

8.25.3 phy identifier register 1 (lsb) - phy id 1 (02d; r)

field | bit(s) | description | mode | default
phy id number | 15:0 | the phy identifier composed of bits 3 through 18 of the organizationally unique identifier (oui). | ro | 0x02a8

8.25.4 phy identifier register 2 (msb) - phy id 2 (03d; r)

field | bit(s) | description | mode | default
manufacturer's revision number | 3:0 | 4 bits containing the manufacturer's revision number. | ro | 0x1
manufacturer's model number | 9:4 | 6 bits containing the manufacturer's part number. | ro | 0x39
phy id number | 15:10 | the phy identifier composed of bits 19 through 24 of the oui. | ro | 0x00

8.25.5 auto-negotiation advertisement register - ana (04d; r/w)

field | bit(s) | description | mode | default
selector field | 4:0 | 00001b = 802.3. other combinations are reserved. unspecified or reserved combinations should not be transmitted. note: setting this field to a value other than 00001b can cause auto-negotiation to fail. | r/w | 00001b
10base-t | 5 | 1b = dte is 10base-t capable. 0b = dte is not 10base-t capable. | r/w | 1b
10base-t full duplex | 6 | 1b = dte is 10base-t full duplex capable. 0b = dte is not 10base-t full duplex capable. | r/w | 1b
100base-tx | 7 | 1b = dte is 100base-tx capable. 0b = dte is not 100base-tx capable. | r/w | 1b (see note 1)
100base-tx full duplex | 8 | 1b = dte is 100base-tx full duplex capable. 0b = dte is not 100base-tx full duplex capable. | r/w | 1b (see note 1)
100base-t4 | 9 | 1b = capable of 100base-t4 (not supported). 0b = not capable of 100base-t4. | r/w | 0b
pause | 10 | advertise to partner that pause operation (as defined in 802.3x) is desired. | r/w | 1b
asm_dir | 11 | advertise asymmetric pause direction bit. this bit is used in conjunction with pause. | r/w | 1b
reserved | 12 | always read as 0b. write to 0b for normal operation. | r/w | 0b
remote fault | 13 | 1b = set remote fault bit. 0b = do not set remote fault bit. | r/w | 0b
reserved | 14 | always read as 0b. write to 0b for normal operation. | r/w | 0b
next page | 15 | 1b = manual control of next page (software). 0b = the 82576 control of next page (auto). | r/w | 0b

note 1: if eeprom adv10lu (word 0x21, bit 3) is asserted, then the default is set to 0b; otherwise, the default is 1b.

8.25.6 auto-negotiation base page ability register - (05d; r)

field | bit(s) | description | mode | default
selector fields[4:0] | 4:0 | 00001b = ieee 802.3. other combinations are reserved. unspecified or reserved combinations must not be transmitted. if the field does not match phy register 04d, bits 4:0, the an process does not complete and no hcd is selected. | ro | n/a
10base-t | 5 | 1b = link partner is 10base-t capable. 0b = link partner is not 10base-t capable. | ro | n/a
10base-t full duplex | 6 | 1b = link partner is 10base-t full duplex capable. 0b = link partner is not 10base-t full duplex capable. | ro | n/a
100base-tx | 7 | 1b = link partner is 100base-tx capable. 0b = link partner is not 100base-tx capable. | ro | n/a
100base-tx full duplex | 8 | 1b = link partner is 100base-tx full duplex capable. 0b = link partner is not 100base-tx full duplex capable. | ro | n/a
100base-t4 | 9 | 1b = link partner is 100base-t4 capable. 0b = link partner is not 100base-t4 capable. | ro | n/a
lp pause | 10 | link partner uses pause operation as defined in 802.3x. | ro | n/a
field | bit(s) | description | mode | default
lp asm_dir | 11 | asymmetric pause direction bit. 1b = link partner is capable of asymmetric pause. 0b = link partner is not capable of asymmetric pause. | ro | n/a
reserved | 12 | always read as 0b. write as 0b. | ro | 0b
remote fault | 13 | 1b = remote fault. 0b = no remote fault. | ro | n/a
acknowledge | 14 | 1b = link partner has received link code word from the phy. 0b = link partner has not received link code word from the phy. | ro | n/a
next page | 15 | 1b = link partner has ability to send multiple pages. 0b = link partner has no ability to send multiple pages. | ro | n/a

8.25.7 auto-negotiation expansion register - ane (06d; r)
field | bit(s) | description | mode | default
link partner auto-negotiation able | 0 | 1b = link partner is auto-negotiation able. 0b = link partner is not auto-negotiation able. | ro | 0b
page received | 1 | indicates that a new page has been received and the received code word has been loaded into phy register 05d (base pages) or phy register 08d (next pages) as specified in clause 28 of 802.3. this bit clears on read. if phy register 16d, bit 1 (alternate np feature) is set, the page received bit also clears when mr_page_rx = false or transmit_disable = true. | ro/lh | 0b
next page able | 2 | 1b = local device is next page able. 0b = local device is not next page able. | ro | 1b
link partner next page able | 3 | 1b = link partner is next page able. 0b = link partner is not next page able. | ro | 0b
parallel detection fault | 4 | 1b = parallel detection fault has occurred. 0b = parallel detection fault has not occurred. | ro/lh | 0b
base page | 5 | this bit indicates the status of the auto-negotiation variable base_page. it flags synchronization with the auto-negotiation state diagram, enabling detection of interrupted links. this bit is only used if phy register 16d, bit 1 (alternate np feature) is set. 1b = base_page = true. 0b = base_page = false. | ro/lh | 0b
reserved | 15:6 | always read as 0b. | ro | 0x0
8.25.8 auto-negotiation next page transmit register - npt (07d; r/w)
field | bit(s) | description | mode | default
message/unformatted field | 10:0 | 11-bit message code field. | r/w | 0x1
toggle | 11 | 1b = previous value of the transmitted link code word = 0b. 0b = previous value of the transmitted link code word = 1b. | ro | 0b
acknowledge 2 | 12 | 1b = complies with message. 0b = cannot comply with message. | r/w | 0b
message page | 13 | 1b = message page. 0b = unformatted page. | r/w | 1b
reserved | 14 | always read as 0b. write to 0b for normal operation. | ro | 0b
next page | 15 | 1b = additional next pages follow. 0b = last page. | r/w | 0b

8.25.9 auto-negotiation next page ability register - lpn (08d; r)
bit(s) | field | description | mode | default
10:0 | message/unformatted field | 11-bit message code field. | ro | 0x0
11 | toggle | 1b = previous value of the transmitted link code word = 0b. 0b = previous value of the transmitted link code word = 1b. | ro | 0b
12 | acknowledge 2 | 1b = link partner complies with the message. 0b = link partner cannot comply with the message. | ro | 0b
13 | message page | 1b = page sent by the link partner is a message page. 0b = page sent by the link partner is an unformatted page. | ro | 0b
14 | acknowledge | 1b = link partner has received link code word from the phy. 0b = link partner has not received link code word from the phy. | ro | 0b
15 | next page | 1b = link partner has additional next pages to send. 0b = link partner has no additional next pages to send. | ro | 0b
8.25.10 1000base-t/100base-t2 control register - gcon (09d; r/w)
bit(s) | field | description | mode | default
7:0 | reserved | always read as 0b. write to 0b for normal operation. | r/w | 0b
8 | 1000base-t half duplex | 1b = dte is 1000base-t half duplex capable. 0b = dte is not 1000base-t half duplex capable. this bit is used by smart negotiation. | r/w | 0b
9 (1) | 1000base-t full duplex | 1b = dte is 1000base-t full duplex capable. 0b = dte is not 1000base-t full duplex capable. this bit is used by smart negotiation. | r/w | 1b
10 | port type | 1b = prefer multi-port device (master). 0b = prefer single port device (slave). this bit is only used when phy register 9, bit 12 is set to 0b. | r/w | 0b
11 | master/slave config value | 1b = configure phy as master during master-slave negotiation (only when phy register 9, bit 12 is set to 1b). 0b = configure phy as slave during master-slave negotiation (only when phy register 9, bit 12 is set to 1b). | r/w | 0b
12 | master/slave config enable | 1b = manual master/slave configuration. 0b = automatic master/slave configuration. | r/w | 0b
15:13 | test mode | 000b = normal mode. 001b = pulse and droop template. 010b = jitter template (master mode). 011b = jitter template (slave mode). 100b = distortion packet. 101b, 110b, 111b = reserved. | r/w | 000b
1. the default of this bit is affected by the eeprom bit configuration of the 82576. if eeprom bit an-1000dis is asserted, then the default is set to 0b. if eeprom bit adv10lu (word 0x21, bit 3) is asserted, then the default is set to 0b.

8.25.11 1000base-t/100base-t2 status register - gstatus (10d; r)
field | bit(s) | description | mode
idle error count | 7:0 | idle error counter value. this register counts the number of invalid idle codes when link is high and the phy is in either 1000base-t or 100base-t modes. if there is an overflow, these bits are all held at 1b. they are cleared on read or a hard or soft reset. | ro, lh
reserved | 9:8 | reserved. always set to 00b. | ro
lp 1000t hd | 10 | 1b = link partner is capable of 1000base-t half duplex. 0b = link partner is not capable of 1000base-t half duplex. the value in bit 10 is not valid until the ane register page received bit equals 1b. | ro
field | bit(s) | description | mode
lp 1000t fd | 11 | 1b = link partner is capable of 1000base-t full duplex. 0b = link partner is not capable of 1000base-t full duplex. the value in bit 11 is not valid until the ane register page received bit equals 1b. | ro
remote receiver status | 12 | 1b = remote receiver ok. 0b = remote receiver not ok. | ro
local receiver status | 13 | 1b = local receiver ok. 0b = local receiver not ok. | ro
master/slave resolution | 14 | 1b = local phy configuration resolved to master. 0b = local phy configuration resolved to slave. the value in bit 14 is not valid until the ane register page received bit equals 1b. | ro
master/slave config fault | 15 | 1b = master/slave configuration fault detected. 0b = no master/slave configuration fault detected. | ro, lh

8.25.12 extended status register - estatus (15d; r)
field | bit(s) | description | mode | default
reserved | 11:0 | reserved. always read as 0b. | ro | 0x0
1000base-t half duplex | 12 | 1b = 1000base-t half duplex capable. 0b = not 1000base-t half duplex capable. | ro | 1b
1000base-t full duplex | 13 | 1b = 1000base-t full duplex capable. 0b = not 1000base-t full duplex capable. | ro | 1b
1000base-x half duplex | 14 | 1b = 1000base-x half duplex capable. 0b = not 1000base-x half duplex capable. | ro | 0b
1000base-x full duplex | 15 | 1b = 1000base-x full duplex capable. 0b = not 1000base-x full duplex capable. | ro | 0b

8.25.13 port configuration register - pconf (16d; r/w)
field | bit(s) | description | mode | default
reserved | 0 | always read as 0b. write to 0b for normal operation. | r/w | 0b
alternate np feature | 1 | 1b = enable alternate auto-negotiate next page feature. 0b = disable alternate auto-negotiate next page feature. | r/w | 0b
reserved | 3:2 | always read as 00b. write to 00b for normal operation. | r/w | 00b
field | bit(s) | description | mode | default
auto mdix parallel detect bypass | 4 | auto_mdix parallel detect bypass. bypasses the fix to the ieee auto-mdix algorithm for the case where the phy is in forced-speed mode and the link partner is auto-negotiating. 1b = strict 802.3 auto-mdix algorithm. 0b = auto-mdix algorithm handles auto-negotiation disabled modes. this is accomplished by lengthening the auto-mdix switch timer before attempting to swap pairs on the first time out. | r/w | 0b
pre_en | 5 | preamble enable. 0b = set rx_dv high coincident with sfd. 1b = set rx_dv high and rxd = preamble (after crs is asserted). | r/w | 1b
reserved | 6 | always read as 0b. write to 0b for normal operation. | r/w | 0b
smart speed | 7 | 1b = smart speed selection enabled. 0b = smart speed selection disabled. note: the default of this bit is determined by the eeprom speed bit (word 0x21, bit 5). | r/w | 0b
tp loopback (10base-t) | 8 | 1b = disable tp loopback during half-duplex operation. 0b = normal operation. | r/w | 1b
reserved | 9 | always read as 0b. write to 0b for normal operation. | r/w | 0b
jabber (10base-t) | 10 | 1b = disable jabber. 0b = enable jabber. | r/w | 0b
bypass 4b5b (100base-tx) | 11 | 1b = bypass 4b5b encoder and decoder. 0b = normal operation. | r/w | 0b
bypass scramble (100base-tx) | 12 | 1b = bypass scrambler and descrambler. 0b = normal operation. | r/w | 0b
transmit disable | 13 | 1b = disable twisted-pair transmitter. 0b = normal operation. | r/w | 0b
link disable | 14 | 1b = force link pass. 0b = normal operation. for 10base-t, this bit forces the link signals to be active. in 100base-t mode, setting this bit should force the link monitor into its linkgood state. for gigabit operation, this merely bypasses auto-negotiation; the link signals still correctly indicate the appropriate status. | r/w | 0b
reserved | 15 | always read as 0b. write 0b for normal operation. | r/w | 0b
8.25.14 port status 1 register - pstat (17d; ro)
field | bit(s) | description | mode | default
lfit indicator | 0 | status bit indicating the auto-negotiation link fail inhibit timer has expired. this indicates that the auto-negotiation process completed page exchanges but was unable to bring up the selected mau's link. 1b = auto-negotiation has aborted link establishment following normal page exchange. 0b = auto-negotiation has either completed normally, or is still in progress. this bit is cleared when read or when one of the following occurs: link comes up (phy register 17d, bit 10 = 1b); auto-negotiation is disabled (phy register 00d, bit 12 = 0b); auto-negotiation is restarted (phy register 00d, bit 9 = 1b). | ro/lh/sc | 0b
polarity status | 1 | 1b = 10base-t polarity is reversed. 0b = 10base-t polarity is normal. | ro | 0b
reserved | 6:2 | ignore these bits. | ro | 0b
reserved | 8:7 | reserved. ignore these bits. | ro | x
duplex mode | 9 | 1b = full duplex. 0b = half duplex. | ro | 0b
link | 10 | indicates the current status of the link. differs from phy register 01, bit 2 in that this bit changes anytime the link status changes. phy register 01, bit 2 latches low and stays low until read regardless of link status. 1b = link is currently up. 0b = link is currently down. | ro | 0b
mdi-x status | 11 | status indicator of the current mdi/mdi-x state of the twisted pair interface. this status bit is valid regardless of the mau selected. 1b = phy has selected mdi-x (crossed over). 0b = phy has selected mdi (not crossed over). | ro | 0b
receive status | 12 | 1b = phy currently receiving a packet. 0b = phy receiver is idle. when in internal loopback, this bit reads as 0b. | ro | 0b
transmit status | 13 | 1b = phy currently transmitting a packet. 0b = phy transmitter is idle. when in internal loopback, this bit reads as 0b. | ro | 0b
data rate | 15:14 | 00b = reserved. 01b = phy operating in 10base-t mode. 10b = phy operating in 100base-tx mode. 11b = phy operating in 1000base-t mode. | ro | 00b
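The PSTAT bit layout above lends itself to a small decoder. The following is a minimal sketch, assuming the 16-bit register value has already been read over MDIO by some other means; the names `PortStatus` and `decode_pstat` are hypothetical and do not come from the datasheet.

```python
from dataclasses import dataclass

# Bit positions taken from the PSTAT (PHY register 17d) table.
PSTAT_DUPLEX = 1 << 9       # 1b = full duplex
PSTAT_LINK = 1 << 10        # 1b = link is currently up
PSTAT_MDIX = 1 << 11        # 1b = MDI-X (crossed over)
PSTAT_RATE_SHIFT = 14       # data rate occupies bits 15:14

RATE_NAMES = {
    0b00: "reserved",
    0b01: "10base-t",
    0b10: "100base-tx",
    0b11: "1000base-t",
}


@dataclass
class PortStatus:
    link_up: bool
    full_duplex: bool
    mdix: bool
    rate: str


def decode_pstat(value: int) -> PortStatus:
    """Decode a raw 16-bit PSTAT value into named fields."""
    value &= 0xFFFF
    return PortStatus(
        link_up=bool(value & PSTAT_LINK),
        full_duplex=bool(value & PSTAT_DUPLEX),
        mdix=bool(value & PSTAT_MDIX),
        rate=RATE_NAMES[(value >> PSTAT_RATE_SHIFT) & 0b11],
    )
```

Because bit 10 is not latched, a caller that only needs the instantaneous link state could poll this register instead of PHY register 01, bit 2.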
8.25.15 port control register - pcont (18d; r/w)
field | bit(s) | description | mode | default
reserved | 3:0 | always read as 0b. write to 0b for normal operation. | r/w | 0x0
tp loopback | 4 | allow gigabit loopback on twisted pairs. | r/w | 0b
extend spd delay | 5 | when set, extends the delay of a power down if the ethernet cable is disconnected. 0b = wait four seconds before beginning power down. 1b = wait 6.3 seconds before beginning power down. | r/w | 0b
reserved | 8:6 | always read as 0b. write to 0b for normal operation. | r/w | 0x0
non-compliant scrambler compensation | 9 | 1b = detect and correct for non-compliant scrambler. 0b = detect and report non-compliant scrambler. note: the default of this bit is affected by the eeprom bit configurations of the 82576. if eeprom word 0x21, bit 2 is asserted, then the default is set to 1b. | r/w | 0b
ten_crs_select | 10 | 1b = extend crs to cover 1000base-t latency and rx_dv. 0b = do not extend crs (rx_dv can continue past crs). | r/w | 1b
flip_chip | 11 | used for applications where the core or application is mirror-imaged. channel d acts like channel a with t10pol_inv set and vice-versa. channel c acts like channel b with t10pol_inv set and vice-versa. this forces the correctness of all mdi/mdix and polarity issues. | r/w | 0b
auto-mdi-x | 12 | auto-mdi-x algorithm enable. 1b = enable auto-mdi-x mode. 0b = disable auto-mdi-x mode (manual mode). note: when forcing speed to 10base-t or 100base-t, use manual mode. clear the bit and set phy register 18d, bit 13 according to the required mdi-x mode. | r/w | 1b
mdi-x mode | 13 | force mdi-x mode. valid only when operating in manual mode (phy register 18d, bit 12 = 0b). 1b = mdi-x (cross over). 0b = mdi (no cross over). | r/w | 0b
reserved | 14 | always read as 0b. write to 0b for normal operation. | r/w | 0b
jitter test clock | 15 | this configuration bit is used to enable the 82576 to drive its differential transmit clock out through the appropriate analog test (atest+/-) output pads. this feature is required in order to demonstrate conformance to the ieee clause 40 jitter specification. when high, it sends the jitter test clock out. this bit works in conjunction with internal/external phy register 0x4011, bit 15. in order to have the clock probed out, it is required to perform the following write sequence: (1) phy register 18d, bit 15 = 1b; (2) phy register 31d = 4010h (page select); (3) phy register 17d = 0080h; (4) phy register 31d = 0000h (page select). | r/w | 0b
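The jitter test clock write sequence described for bit 15 can be sketched as follows. This is only an illustration: `mdio_read` and `mdio_write` are hypothetical callables standing in for whatever MDIO access routine the platform provides, and `FakePhy` exists purely so the example runs without hardware. Setting bit 15 with a read-modify-write, so the other PCONT bits keep their values, is an assumption; the datasheet states only that bit 15 must be 1b.

```python
JITTER_TEST_CLOCK = 1 << 15   # PHY register 18d, bit 15
PHY_PCONT = 18                # port control register
PHY_PAGE_SELECT = 31          # page select register
PHY_REG_17 = 17               # register 17d on the selected page


def enable_jitter_test_clock(mdio_read, mdio_write):
    """Issue the four-step sequence listed in the PCONT table."""
    pcont = mdio_read(PHY_PCONT)
    mdio_write(PHY_PCONT, pcont | JITTER_TEST_CLOCK)  # step 1: bit 15 = 1b
    mdio_write(PHY_PAGE_SELECT, 0x4010)               # step 2: select page
    mdio_write(PHY_REG_17, 0x0080)                    # step 3
    mdio_write(PHY_PAGE_SELECT, 0x0000)               # step 4: back to base page


class FakePhy:
    """In-memory stand-in for a PHY; records every write (hypothetical)."""

    def __init__(self, pcont_default=0x1400):
        # 0x1400 = auto-mdi-x (bit 12) and ten_crs_select (bit 10) defaults.
        self.regs = {PHY_PCONT: pcont_default}
        self.log = []

    def read(self, reg):
        return self.regs.get(reg, 0)

    def write(self, reg, value):
        self.regs[reg] = value & 0xFFFF
        self.log.append((reg, value & 0xFFFF))


phy = FakePhy()
enable_jitter_test_clock(phy.read, phy.write)
```

After the call, `phy.log` holds the four writes in the order the datasheet lists them.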
8.25.16 link health register - link (19d; ro)
field | bit(s) | description | mode | hw rst
valid channel a | 0 | the channel a dsp has converged to incoming data. | ro | 0b
valid channel b | 1 | the channel b dsp has converged to incoming data. | ro | 0b
valid channel c | 2 | the channel c dsp has converged to incoming data. | ro | 0b
valid channel d | 3 | the channel d dsp has converged to incoming data. if an_enable is true, valid_chan_a = dsplocka latched on the rising edge of link_fail_inhibit_timer_done and link = 0b. if an_enable is false, valid_chan_a = dsplocka. | ro | 0b
auto-negotiation active | 4 | auto-negotiate is actively deciding hcd. | ro | 0b
reserved | 5 | always read as 0b. | ro | 0b
auto-negotiation fault | 6 | auto-negotiate fault: this is the logical or of phy register 01d, bit 4, phy register 06d, bit 4, and phy register 10d, bit 15. | ro | 0b
reserved | 7 | always read as 0b. | ro | 0b
data err[0] | 8 | mode: 10: 10 mb/s polarity error. 100: symbol error. 1000: gig idle error. | ro/lh | x
data err[1] | 9 | mode: 10: n/a. 100: scrambler unlocked. 1000: local receiver not ok. | ro/lh | x
count overflow | 10 | 32 idle error events were counted in less than 1 ms. | ro/lh | 0b
gigabit rem rcvr nok | 11 | gig has detected a remote receiver status error. this is a latched high version of phy register 10d, bit 12. | ro/lh | 0b
gigabit master resolution | 12 | gig has resolved to master. this is a duplicate of phy register 10d, bit 14. programmers must read phy register 10d, bit 14 to clear this bit. | ro | 0b
gigabit master fault | 13 | a fault has occurred with the gig master/slave resolution process. this is a copy of phy register 10d, bit 15. programmers must read phy register 10d, bit 15 to clear this bit. | ro | 0b
gigabit scrambler error | 14 | 1b = the phy has detected gigabit connection errors that are most likely due to a non-ieee compliant scrambler in the link partner. 0b = normal scrambled data. definition: if an_enable is true and in gigabit mode, on the rising edge of internal signal link_fail_inhibit_timer_done, dsp_lock is true but loc_rcvr_ok is false. | ro | 0b
ss downgrade | 15 | smart speed has downgraded the link speed from the maximum advertised. | ro/lh | 0b
8.25.17 1000base-t fifo register - pfifo (20d; r/w)
field | bit(s) | description | mode | default
buffer size | 3:0 | an unsigned integer that stipulates the number of write clocks to delay the read controller after internal 1000base-t's tx_en is first asserted. this buffer protects from underflow at the expense of latency. the maximum value that can be set is 13d or 0xd. | r/w | 0101b
reserved | 7:4 | must be set to 0100b for normal operation. | r/w | 0100b
fifo out steering | 9:8 | 00b, 01b: enable the output data bus from 1000base-t fifo to transmitters, drive zeros on the output loop-back bus from 1000base-t fifo to external application and to dsp rx-fifos in test mode. 10b: drive zeros on output bus from 1000base-t fifo to transmitters, enable data on the output loop-back bus from 1000base-t fifo to external application and to dsp rx-fifos in test mode. 11b: enable the output data bus from 1000base-t fifo to both transmitters and loop-back bus. | r/w | 00b
disable error out | 10 | when set, disables the addition of under/overflow errors to the output data stream on internal 1000base-t's tx_error. | r/w | 0b
reserved | 13:11 | always read as 0b. write to 0b for normal operation. | r/w | 0x0
fifo overflow | 14 | status bit set when a read clock that is slower than internal 1000base-t's gtx_clk has allowed the fifo to fill to capacity mid packet. decrease buffer size. | ro/lh | 0b
fifo underflow | 15 | status bit set when a read clock that is faster than internal 1000base-t's gtx_clk empties the fifo mid packet. increase the buffer size. | ro/lh | 0b

8.25.18 channel quality register - chan (21d; ro)
field | bit(s) | description | mode | default
mse_a | 3:0 | the converged mean square error for channel a. | ro | 0x0
mse_b | 7:4 | the converged mean square error for channel b. | ro | 0x0
mse_c | 11:8 | the converged mean square error for channel c. | ro | 0x0
mse_d | 15:12 | the converged mean square error for channel d. this field is only meaningful in gigabit, or in 100base-tx if this is the receive pair. use of this field is complex and needs interpretation based on the chosen threshold value. | ro | 0x0

8.25.19 phy power management - (25d; r/w)
field | bit(s) | description | mode | default
reserved | 15:9 | always read as 0b. write to 0b for normal operation. | r/w | 0x0
rst_compl | 8 | indicates phy internal reset cleared. | lh | 0b
field | bit(s) | description | mode | default
spd_b2b_en | 7 | spd back-to-back enable. | r/w | 1b
disable 1000 | 6 | when set, disables 1000 mb/s in all power modes. note that this bit can be loaded from eeprom. | r/w | 0b
go link disconnect | 5 | setting this bit causes the phy to enter link disconnect mode immediately. | r/w | 0b
link energy detect | 4 | this bit is set when the phy detects energy on the link. note that this bit is valid only if auto-negotiation is enabled (phy register 00d, bit 12) and spd_en is enabled (phy register 25d, bit 0). | r/w | 0b
disable 1000 nd0a | 3 | disables 1000 mb/s operation in non-d0a states. note that this bit can be loaded from eeprom. | r/w | 1b
lplu | 2 | low power on link up. when set, enables the decrease in link speed while in non-d0a states when the power policy and power management state specify it. note: this bit can be loaded from eeprom. if this bit is loaded from eeprom, it is reset to the eeprom value after each fundamental reset. | r/w | 1b
d0lplu | 1 | d0 low power link up. when set, configures the phy to negotiate for a low speed link while in d0a state. | r/w | 0b
spd_en | 0 | smart power down. when set, enables phy smart power down mode. note that this bit can be loaded from eeprom. | r/w | 1b
note: part of the default values of this register can be changed by eeprom settings.

8.25.20 special gigabit disable register - (26d; r/w)
field | bit(s) | description | mode | default
reserved | 15:0 | always read as 0b. write to 0b for normal operation. | r/w | 0x0

8.25.21 misc. control register 1 - (27d; r/w)
field | bit(s) | description | mode | default
reserved | 15 | ignore this bit. | r/w | 0b
reserved | 14 | always read as 0b. write to 0b for normal operation. | ro | 0b
reserved | 13 | must be 1 for normal operation. | r/w | 1b
duplex_manual_set | 12 | when set, the 82576 sets the duplex according to the duplex mode bit in register 0. this bit is cleared following auto-negotiation. | r/w | 0b
reserved | 11:9 | always read as 0b. write to 0b for normal operation. | ro | 0x0
field | bit(s) | description | mode | default
ss_cfg_cntr | 8:6 | smart speed counter configuration: 1-5 (001b:101b). | r/w | 010b
t10_auto_pol_dis | 5 | when set, disables the auto-polarity mechanism in the 10 block. | r/w | 0b
reserved | 4:0 | always read as 0b. write to 0b for normal operation. | r/w | 0x0

8.25.22 misc. control register 2 - (28d; ro)
field | bit(s) | description | mode | default
reserved | 15:14 | always read as 0b. write to 0b for normal operation. | r/w | 0x0
act_an_adv_gigfdx | 13 | indicates the actual an advertisement of the phy for 1000 full-duplex capability. 0b = not 1000 full duplex capable. 1b = 1000 full duplex capable. | ro | 0b
act_an_adv_gighdx | 12 | indicates the actual an advertisement of the phy for 1000 half-duplex capability. 0b = not 1000 half duplex capable. 1b = 1000 half duplex capable. | ro | 0b
act_an_adv_100fd | 11 | indicates the actual an advertisement of the phy for 100 full-duplex capability. 0b = not 100 full duplex capable. 1b = 100 full duplex capable. | ro | 0b
act_an_adv_100hd | 10 | indicates the actual an advertisement of the phy for 100 half-duplex capability. 0b = not 100 half duplex capable. 1b = 100 half duplex capable. | ro | 0b
act_an_adv_10fdx | 9 | indicates the actual an advertisement of the phy for 10 full-duplex capability. 0b = not 10 full duplex capable. 1b = 10 full duplex capable. | ro | 0b
act_an_adv_10hdx | 8 | indicates the actual an advertisement of the phy for 10 half-duplex capability. 0b = not 10 half duplex capable. 1b = 10 half duplex capable. | ro | 0b
reserved | 7:0 | reserved. | r/w | 0x0
note: bits 13:8 might differ from the corresponding bits in phy register 04d and 09d due to non-ieee phy features (lplu, an1000_dis, and smart-speed).

8.25.23 page select core register - (31d; wo)
field | bit(s) | description | mode | default
page_sel | 15:0 | this register is used to swap out the base page containing the ieee registers for intel reserved test and debug pages residing within the extended address space. | wo | 0x0
8.26 virtual function device registers

8.26.1 queues registers
each vf has two queues, q0 and q1. these queues are also used by the pf when working in non-iov mode or if not all vfs are allocated to vms. the mapping between the virtual queue number (vqn) and the physical queue number (pqn) is given by the equation: pqn = vfn + vqn*8 (where vfn is the vf number). for example: q0 and q1 of vf0 are actually q0 and q8, q0 and q1 of vf1 are actually q1 and q9, etc. the virtual address of the q0 and q1 registers is always the same (rx: 0x2800 and 0x2900, tx: 0x3800 and 0x3900), like the physical q0 and q1 in the 82575 aliased area.

8.26.2 non-queue registers
non-queue registers have the same virtual addresses as the corresponding pf registers. these registers are mapped to the physical address space at 0x10000, where each vf gets 0x100 bytes for its registers:
- vf0 registers: 0x10000 - 0x100ff
- vf1 registers: 0x10100 - 0x101ff
- ...

8.26.2.1 eitr registers
the 82576 supports 25 eitr registers. in non-iov mode, all the eitr registers can be used by the pf. in iov mode, 3 eitr registers are allocated to each vf and the pf should only use the remaining eitr registers. the eitr0-2 registers are accessed by the vfs at addresses 0x1680 - 0x1688 and match the pf eitrs according to the following table. eitr0 is always allocated to the pf.
vf | pf eitr | physical address
0 | eitr22 - eitr24 | 0x16d8 - 0x16e0
1 | eitr19 - eitr21 | 0x16cc - 0x16d4
2 | eitr16 - eitr18 | 0x16c0 - 0x16c8
... | ... | ...
7 | eitr1 - eitr3 | 0x1684 - 0x168c

8.26.2.2 msi-x registers
the msi-x vectors of each vf are reflected in its bar3. the pba bits of the vfs are not replicated in the pf.
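The queue and EITR mappings in sections 8.26.1 and 8.26.2.1 reduce to a few lines of arithmetic. The sketch below is illustrative only; the function names are hypothetical. It assumes eight VFs (0 to 7), as implied by the EITR table, and that PF EITR n sits at 0x1680 + 4*n, which is consistent with every row of that table.

```python
NUM_VFS = 8              # VF numbers 0..7, per the EITR table
EITR_BASE = 0x1680       # address of PF EITR0 (assumed: EITRn = base + 4*n)
VF_REG_BASE = 0x10000    # start of the per-VF non-queue register area
VF_REG_STRIDE = 0x100    # each VF gets 0x100 bytes


def _check_vf(vfn: int) -> None:
    if not 0 <= vfn < NUM_VFS:
        raise ValueError(f"vf number out of range: {vfn}")


def pqn(vfn: int, vqn: int) -> int:
    """Physical queue number for virtual queue vqn (0 or 1) of VF vfn."""
    _check_vf(vfn)
    if vqn not in (0, 1):
        raise ValueError("each vf has only q0 and q1")
    return vfn + vqn * 8


def vf_nonqueue_base(vfn: int) -> int:
    """Physical base address of a VF's non-queue registers."""
    _check_vf(vfn)
    return VF_REG_BASE + vfn * VF_REG_STRIDE


def pf_eitr_index(vfn: int, i: int) -> int:
    """PF EITR number backing VF EITR i (0..2)."""
    _check_vf(vfn)
    if not 0 <= i <= 2:
        raise ValueError("a vf has eitr0..eitr2 only")
    return 22 - 3 * vfn + i


def eitr_phys_addr(vfn: int, i: int) -> int:
    """Physical address of VF EITR i."""
    return EITR_BASE + 4 * pf_eitr_index(vfn, i)
```

For instance, `pqn(1, 1)` gives 9 and `eitr_phys_addr(7, 0)` gives 0x1684, matching the examples and the last table row above.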
8.26.3 register set - csr bar
virtual address | physical address base | abbreviation | name
0x0000/0x0004 | 0x10000 + vfn * 0x100 | vtctrl | control (only rst bit)
0x0008 | 0x0008 (common - ro) | vtstatus | status (mirror of pf status register)
0x1048 | 0x1048 (common - ro) | vtfrtimer | free running timer (mirror of pf timer)
0x1520 (1) | n/a | vteics | extended interrupt cause set register
0x1524 | n/a | vteims | extended interrupt mask set/read register
0x1528 | n/a | vteimc | extended interrupt mask clear register
0x152c | 0x1002c + vfn * 0x100 n/a | vteiac | extended interrupt auto clear register
0x1530 | n/a | vteiam | extended interrupt auto mask enable register
0x1580 | n/a | vteicr | extended interrupt cause register
0x1680 - 0x1688 | 0x16d8 - 0x16e0 - vfn * 0xc | eitr 0-2 | interrupt throttle registers 0-2
0x1700 | 0x1700 + vfn * 4 | vtivar0 | interrupt vector allocation register queues
0x1740 | 0x1720 + vfn * 4 | vtivar_misc | interrupt vector allocation register misc.
0x0f04 | 0x5b68 | pbacl | pba clear
0x0f0c | 0x5480 + vfn * 0x4 | psrtype | replication packet split receive type
0x0c40 | 0x0c40 + vfn * 0x4 | vfmailbox | virtual function mailbox
0x0800 - 0x083f | 0x0800 - 0x083f + vfn * 0x40 | vmbmem | virtualization mail box memory
0x2800 + n*0x100 (n=0,1) | 0xc000, 0xc200 + vfn * 0x40 | rdbal0/1 | receive descriptor base address low 0/1
0x2804 + n*0x100 (n=0,1) | 0xc004, 0xc204 + vfn * 0x40 | rdbah0/1 | receive descriptor base address high 0/1
0x2808 + n*0x100 (n=0,1) | 0xc008, 0xc208 + vfn * 0x40 | rdlen0/1 | receive descriptor length 0/1
0x280c + n*0x100 (n=0,1) | 0xc00c, 0xc20c + vfn * 0x40 | srrctl0/1 | split and replication receive control register 0/1
0x2810 + n*0x100 (n=0,1) | 0xc010, 0xc210 + vfn * 0x40 | rdh0/1 | receive descriptor head 0/1
0x2814 + n*0x100 (n=0,1) | 0xc014, 0xc214 + vfn * 0x40 | rxctl0/1 | rx dca control registers 0/1
0x2818 + n*0x100 (n=0,1) | 0xc018, 0xc218 + vfn * 0x40 | rdt0/1 | receive descriptor tail 0/1
virtual address | physical address base | abbreviation | name
0x2828 + n*0x100 (n=0,1) | 0xc028, 0xc228 + vfn * 0x40 | rxdctl0/1 | receive descriptor control 0/1
0x2830 + n*0x100 (n=0,1) | 0xc030, 0xc230 + vfn * 0x40 | rqdpc0/1 | receive queue drop packet count 0/1
0x3800 + n*0x100 (n=0,1) | 0xe000, 0xe200 + vfn * 0x40 | tdbal0/1 | transmit descriptor base address low 0/1
0x3804 + n*0x100 (n=0,1) | 0xe004, 0xe204 + vfn * 0x40 | tdbah0/1 | transmit descriptor base address high 0/1
0x3808 + n*0x100 (n=0,1) | 0xe008, 0xe208 + vfn * 0x40 | tdlen0/1 | transmit descriptor ring length 0/1
0x3810 + n*0x100 (n=0,1) | 0xe010, 0xe210 + vfn * 0x40 | tdh0/1 | transmit descriptor head 0/1
0x3814 + n*0x100 (n=0,1) | 0xe014, 0xe214 + vfn * 0x40 | txctl0/1 | tx dca control registers 0/1
0x3818 + n*0x100 (n=0,1) | 0xe018, 0xe218 + vfn * 0x40 | tdt0/1 | transmit descriptor tail 0/1
0x3828 + n*0x100 (n=0,1) | 0xe028, 0xe228 + vfn * 0x40 | txdctl0/1 | transmit descriptor control 0/1
0x3838 + n*0x100 (n=0,1) | 0xe038, 0xe238 + vfn * 0x40 | tdwbal0/1 | tx descriptor completion writeback address low 0/1
0x383c + n*0x100 (n=0,1) | 0xe03c, 0xe23c + vfn * 0x40 | tdwbah0/1 | tx descriptor completion writeback address high 0/1
0x0f10 (2) | 0x10010 + vfn * 0x100 | vfgprc | good packets received count
0x0f14 | 0x10014 + vfn * 0x100 | vfgptc | good packets transmitted count
0x0f18 | 0x10018 + vfn * 0x100 | vfgorc | good octets received count
0x0f34 | 0x10034 + vfn * 0x100 | vfgotc | good octets transmitted count
0x0f3c | 0x1003c + vfn * 0x100 | vfmprc | multicast packets received count
0x0f40 | 0x10040 + vfn * 0x100 | vfgprlbc | good rx packets loopback count
0x0f44 | 0x10044 + vfn * 0x100 | vfgptlbc | good tx packets loopback count
0x0f48 | 0x10048 + vfn * 0x100 | vfgorlbc | good rx octets loopback count
0x0f50 | 0x10050 + vfn * 0x100 | vfgotlbc | good tx octets loopback count
0x34e8 | 0x34e8 | pbtwac | tx packet buffer wrap around counter
0x24e8 | 0x24e8 | pbrwac | rx packet buffer wrap around counter
0x30e8 | 0x30e8 | pbswac | switch packet buffer wrap around counter
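The queue and statistics rows of the CSR BAR table follow two regular patterns, which can be expressed as a translation from a VF virtual address to the physical address. This is a sketch under assumptions: the function names are hypothetical, only the rx/tx queue blocks and the 0x0f10 to 0x0f50 counters are handled, and other rows of the table (for example pbacl and psrtype) follow different rules and are deliberately rejected.

```python
# (virtual base, physical base) for the rx and tx queue register blocks.
QUEUE_BLOCKS = ((0x2800, 0xC000), (0x3800, 0xE000))


def queue_reg_phys(virt: int, vfn: int) -> int:
    """Translate a VF queue register address (q0 or q1) to its physical address.

    Per the table: virtual = base + n*0x100 + offset and
    physical = phys_base + n*0x200 + vfn*0x40 + offset, with n = 0 or 1.
    """
    if not 0 <= vfn <= 7:
        raise ValueError("vf number must be 0..7")
    for vbase, pbase in QUEUE_BLOCKS:
        if vbase <= virt < vbase + 0x200:
            n, offset = divmod(virt - vbase, 0x100)
            if offset >= 0x40:
                raise ValueError("offset outside the per-queue register block")
            return pbase + n * 0x200 + vfn * 0x40 + offset
    raise ValueError("not a vf queue register address")


def stats_reg_phys(virt: int, vfn: int) -> int:
    """Translate a VF statistics counter address (0x0f10..0x0f50)."""
    if not 0 <= vfn <= 7:
        raise ValueError("vf number must be 0..7")
    if not 0x0F10 <= virt <= 0x0F50:
        raise ValueError("not a vf statistics register address")
    return 0x10000 + (virt - 0x0F00) + vfn * 0x100
```

The 0x200 step between q0 and q1 equals 8 * 0x40, so the result agrees with pqn = vfn + vqn*8 when each physical queue occupies 0x40 bytes.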
1. vteics, vteims, vteimc, vteiac, vteiam, vteicr: vf interrupt bits can also be accessed using the pf interrupt registers; see section 8.8, interrupt register descriptions. bit i of vf v maps to bit (25 - (v+1)*3 + i) in the pf register.
2. bold addresses indicate registers whose virtual addresses are different from their physical addresses due to the need to maintain a virtual address space of 16 kbytes.

8.26.4 register set - msi-x bar
virtual address | physical address base (+ vfn * 0x30) | abbreviation | name
0x0000 - 0x0020 | 0x00010 | msixtadd | msix table entry lower address
0x0004 - 0x0024 | 0x00018 | msixtuadd | msix table entry upper address
0x0008 - 0x0028 | 0x00028 | msixtmsg | msix table entry message
0x000c - 0x002c | n/a | msixtvctrl | msix table vector control
max(page size, 0x2000) | n/a | msixpba | msi-x pending bit array

8.27 virtual function register descriptions
all the registers in this section are replicated per vf. the addresses are relative to the beginning of each vf address space. the address of a register, relative to the vf bar0 programmed in the iov structure in the pf configuration space (offset 0x180-0x184), can be found by the following formula: vf bar0 + max(16k, system page size) * vf# + csr offset. see section 8.26.3 for the list of registers exposed to the vf detailed below.

8.27.1 vt control register - vtctrl (0x0000; rw)
field | bit(s) | initial value | description
reserved | 25:0 | 0x0 | reserved.
rst | 26 | 0b | vf reset. this bit performs a reset of the queue enable and the interrupt registers of the vf.
reserved | 31:27 | 0x0 | reserved.
reserved | 31 | 0 | reserved. should be written with 0 to ensure future compatibility. read as 0.

8.27.2 vf status register - status (0x00008; ro)
this register is a mirror of the pf status register. see table for details of this register.
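The VF address formula from section 8.27 and the interrupt bit mapping from footnote 1 can be sketched as below. The function names and the 4 KB default page size are assumptions made for illustration; a real driver would take the page size from the system and the VF BAR0 value from the IOV structure.

```python
def vf_csr_address(vf_bar0: int, vf: int, csr_offset: int,
                   system_page_size: int = 4096) -> int:
    """vf bar0 + max(16k, system page size) * vf# + csr offset."""
    if not 0 <= vf <= 7:
        raise ValueError("vf number must be 0..7")
    stride = max(16 * 1024, system_page_size)
    return vf_bar0 + stride * vf + csr_offset


def pf_interrupt_bit(v: int, i: int) -> int:
    """Bit in the PF interrupt registers that mirrors bit i (0..2) of VF v."""
    if not 0 <= v <= 7 or not 0 <= i <= 2:
        raise ValueError("expected vf 0..7 and bit 0..2")
    return 25 - (v + 1) * 3 + i


# Example with a made-up BAR value: VTEICS (offset 0x1520) of VF 2.
example = vf_csr_address(0xF0000000, 2, 0x1520)
```

With 4 KB pages the per-VF stride is 16 KB, so `example` evaluates to 0xF0009520; with 64 KB pages the stride grows to 64 KB.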
8.27.3 vt free running timer - vtfrtimer (0x01048; ro)
this register reflects the value of a free running timer that can be used for various timeout indications. the register is reset by a pci reset and/or software reset. this register is a mirror of the pf register. see the description of this register in section 8.16.3.

8.27.4 vt extended interrupt cause - vteicr (0x01580; rc/w1c)
see the description of this register in section 8.8.1.
field | bit(s) | initial value | description
msix | 2:0 | 0x0 | indicates an interrupt cause mapped to msi-x vectors 2:0.
reserved | 31:3 | 0x0 | reserved.

8.27.5 vt extended interrupt cause set - vteics (0x01520; wo)
see the description of this register in section 8.8.2.
field | bit(s) | initial value | description
msix | 2:0 | 0x0 | sets the corresponding eicr bit of msi-x vectors 2:0.
reserved | 31:3 | 0x0 | reserved.

8.27.6 vt extended interrupt mask set/read - vteims (0x01524; rws)
see the description of this register in section 8.8.3.
field | bit(s) | initial value | description
msix | 2:0 | 0x0 | set mask bit for the corresponding eicr bit of msi-x vectors 2:0.
reserved | 31:3 | 0x0 | reserved.

8.27.7 vt extended interrupt mask clear - vteimc (0x01528; wo)
see the description of this register in section 8.8.4.
field | bit(s) | initial value | description
msix | 2:0 | 0x0 | clear mask bit for the corresponding eicr bit of msi-x vectors 2:0.
reserved | 31:3 | 0x0 | reserved.

8.27.8 vt extended interrupt auto clear - vteiac (0x0152c; r/w)
see the description of this register in section 8.8.5.
8.27.8 vt extended interrupt auto clear - vteiac (continued)

field | bit(s) | initial value | description
msix | 2:0 | 0x0 | auto clear bit for the corresponding eicr bit of msi-x vectors 2:0
reserved | 31:3 | 0x0 | reserved

8.27.9 vt extended interrupt auto mask enable - vteiam (0x01530; r/w)

see the description of this register in section 8.8.6.

field | bit(s) | initial value | description
msix | 2:0 | 0x0 | auto mask bit for the corresponding eicr bit of msi-x vectors 2:0
reserved | 31:3 | 0x0 | reserved

8.27.10 vt interrupt throttle - vteitr (0x01680 + 4*n [n = 0...2]; r/w)

see the description of this register in section 8.8.12.

8.27.11 vt interrupt vector allocation registers - vtivar (0x01700; rw)

these registers define the allocation of the two queue pairs interrupt causes as defined in table 7-44 to one of the msi-x vectors. each int_alloc[i] (i=0...3) field is a byte indexing an entry in the msi-x table structure and msi-x pba structure.

field | bit(s) | initial value | description
int_alloc[0] | 1:0 | x | defines the msi-x vector assigned to the interrupt cause associated with queue 0 rx. valid values are 0 to 2.
reserved | 6:2 | 0x0 | reserved
int_alloc_val[0] | 7 | 0b | valid bit for int_alloc[0]
int_alloc[1] | 9:8 | x | defines the msi-x vector assigned to the interrupt cause associated with queue 0 tx. valid values are 0 to 2.
reserved | 14:10 | 0x0 | reserved
int_alloc_val[1] | 15 | 0b | valid bit for int_alloc[1]
int_alloc[2] | 17:16 | x | defines the msi-x vector assigned to the interrupt cause associated with queue 1 rx. valid values are 0 to 2.
reserved | 22:18 | 0x0 | reserved
int_alloc_val[2] | 23 | 0b | valid bit for int_alloc[2]
8.27.11 vt interrupt vector allocation registers - vtivar (continued)

field | bit(s) | initial value | description
int_alloc[3] | 25:24 | x | defines the msi-x vector assigned to the interrupt cause associated with queue 1 tx. valid values are 0 to 2.
reserved | 30:26 | 0x0 | reserved
int_alloc_val[3] | 31 | 0b | valid bit for int_alloc[3]

8.27.12 vt interrupt vector allocation registers - vtivar_misc (0x01740; rw)

this register defines the msi-x vector allocated to the mailbox interrupt. a mailbox interrupt is asserted in the vf upon reception of a mailbox message or an acknowledge from the pf. it is also asserted when the rsti bit rises.

field | bit(s) | initial value | description
int_alloc[4] | 1:0 | x | defines the msi-x vector assigned to the interrupt cause associated with the mailbox. valid values are 0 to 2.
reserved | 6:2 | 0x0 | reserved
int_alloc_val[4] | 7 | 0b | valid bit for int_alloc[4]
reserved | 31:8 | 0x0 | reserved

8.27.13 msi-x table entry lower address - msixtadd (bar3: 0x0000 + 16*n [n=0...2]; r/w)

see section 8.9.1 for information about this register.

8.27.14 msi-x table entry upper address - msixtuadd (bar3: 0x0004 + 16*n [n=0...2]; r/w)

see section 8.9.2 for information about this register.

8.27.15 msi-x table entry message - msixtmsg (bar3: 0x0008 + 16*n [n=0...2]; r/w)

see section 8.9.3 for information about this register.

8.27.16 msi-x table entry vector control - msixtvctrl (bar3: 0x000c + 16*n [n=0...2]; r/w)

see section 8.9.4 for information about this register.
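as an illustration of the vtivar and vtivar_misc layouts, the sketch below packs msi-x vector numbers into the byte-wide int_alloc fields and sets the matching valid bit (bit 7 of each byte). the function names and the use of none for "leave this cause unmapped" are hypothetical; only the bit positions come from the field tables.

```python
from typing import Optional

VALID_BIT = 0x80  # int_alloc_val[i] sits in bit 7 of byte i


def _field(vector: int) -> int:
    """One int_alloc byte: vector number in bits 1:0 plus the valid bit."""
    if not 0 <= vector <= 2:  # valid values are 0 to 2
        raise ValueError("msi-x vector must be 0, 1 or 2")
    return vector | VALID_BIT


def encode_vtivar(q0_rx: Optional[int] = None, q0_tx: Optional[int] = None,
                  q1_rx: Optional[int] = None, q1_tx: Optional[int] = None) -> int:
    """Build a VTIVAR value; causes passed as None keep their valid bit clear."""
    value = 0
    for index, vector in enumerate((q0_rx, q0_tx, q1_rx, q1_tx)):
        if vector is not None:
            value |= _field(vector) << (8 * index)
    return value


def encode_vtivar_misc(mailbox_vector: int) -> int:
    """Build a VTIVAR_MISC value mapping the mailbox cause (int_alloc[4])."""
    return _field(mailbox_vector)


if __name__ == "__main__":
    # queue 0 rx/tx on vector 0, queue 1 rx/tx on vector 1, mailbox on vector 2
    print(hex(encode_vtivar(0, 0, 1, 1)), hex(encode_vtivar_misc(2)))
```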
8.27.17 msixpba - msixpba (bar3: 0x02000; ro)

field | bit(s) | initial value | description
pending bits | 2:0 | 0x0 | for each pending bit that is set, the function has a pending message for the associated msi-x table entry. pending bits that have no associated msi-x table entry are reserved.
reserved | 31:3 | 0x0 | reserved

note: if a page size larger than 8k is programmed in the iov structure, the address of the msi-x pba table moves to be page aligned.

8.27.18 msi-x pba clear - pbacl (0x00f04; r/w1c)

field | bit(s) | initial value | description
penbit | 2:0 | 0x0 | msi-x pending bits clear. writing a 1b to any bit clears the corresponding msixpba bit; writing a 0b has no effect. reading this register returns the pba vector.
reserved | 31:3 | 0x0 | reserved

8.27.19 receive descriptor base address low - rdbal (0x02800 + 256*n [n=0...1]; r/w)

see section 8.10.5 for information about this register.

8.27.20 receive descriptor base address high - rdbah (0x02804 + 256*n [n=0...1]; r/w)

see section 8.10.6 for information about this register.

8.27.21 receive descriptor ring length - rdlen (0x02808 + 256*n [n=0...1]; r/w)

see section 8.10.7 for information about this register.

8.27.22 receive descriptor head - rdh (0x02810 + 256*n [n=0...1]; ro)

see section 8.10.8 for information about this register.
8.27.23 receive descriptor tail - rdt (0x02818 + 256*n [n=0...1]; r/w)

see section 8.10.9 for information about this register.

8.27.24 receive descriptor control - rxdctl (0x02828 + 256*n [n=0...1]; r/w)

see section 8.10.10 for information about this register.

8.27.25 split and replication receive control register queue - srrctl (0x0280c + 256*n [n=0...1]; r/w)

see section 8.10.2 for information about this register.

8.27.26 receive queue drop packet count - rqdpc (0x2830 + 256*n [n=0...1]; rc)

see section 8.10.11 for information about this register.

8.27.27 replication packet split receive type - psrtype (0x00f0c; r/w)

see section 8.10.3 for information about this register.

8.27.28 transmit descriptor base address low - tdbal (0x3800 + 256*n [n=0...1]; r/w)

see section 8.12.8 for information about this register.

8.27.29 transmit descriptor base address high - tdbah (0x03804 + 256*n [n=0...1]; r/w)

see section 8.12.9 for information about this register.

8.27.30 transmit descriptor ring length - tdlen (0x03808 + 256*n [n=0...1]; r/w)

see section 8.12.10 for information about this register.

8.27.31 transmit descriptor head - tdh (0x03810 + 256*n [n=0...1]; ro)

see section 8.12.11 for information about this register.
8.27.32 transmit descriptor tail - tdt (0x03818 + 256*n [n=0...1]; r/w)

see section 8.12.12 for information about this register.

8.27.33 transmit descriptor control - txdctl (0x03828 + 256*n [n=0...1]; r/w)

see section 8.12.13 for information about this register.

8.27.34 tx descriptor completion write-back address low - tdwbal (0x03838 + 256*n [n=0...1]; r/w)

see section 8.12.14 for information about this register.

8.27.35 tx descriptor completion write-back address high - tdwbah (0x0383c + 256*n [n=0...1]; r/w)

see section 8.12.15 for information about this register.

8.27.36 rx dca control registers - rxctl (0x02814 + 256*n [n=0...1]; r/w)

see section 8.13.1 for information about this register.

8.27.37 tx dca control registers - txctl (0x03814 + 256*n [n=0...1]; r/w)

see section 8.13.2 for information about this register.

8.27.38 good packets received count - vfgprc (0x0f10; ro)

this register counts the number of good packets of any legal length received by the queues allocated to this vf. this counter includes loopback packets or replications of multicast packets. unlike some other statistics registers that are not allocated per vf, this register is not cleared on read. furthermore, the register continues to count from 0x0000 on stepping beyond 0xffff. an flr to the vf may cause some inaccuracy in this counter.

field | bit(s) | initial value | description
gprc | 31:0 | 0x0 | number of good packets received (of any length).
8.27.39 good packets transmitted count - vfgptc (0x0f14; ro)

this register counts the number of good packets transmitted by the queues allocated to this vf. this counter includes loopback packets or packets later dropped by the switch or the mac. unlike some other statistics registers that are not allocated per vf, this register is not cleared on read. furthermore, the register continues to count from 0x0000 on stepping beyond 0xffff.

field | bit(s) | initial value | description
gptc | 31:0 | 0x0 | number of good packets sent.

8.27.40 good octets received count - vfgorc (0x0f18; ro)

this register counts the number of good (no errors) octets received by this vf. this counter includes loopback packets or replications of multicast packets. this register includes bytes received in a packet from the field through the field, inclusive. only octets of packets that pass address filtering are counted in this register. unlike some other statistics registers that are not allocated per vf, this register is not cleared on read. furthermore, the register continues to count from 0x0000 on stepping beyond 0xffff. an flr to the vf may cause some inaccuracy in this counter.

field | bit(s) | initial value | description
gorc | 31:0 | 0x0 | number of good octets received

8.27.41 good octets transmitted count - vfgotc (0x0f34; ro)

this register counts the number of good (no errors) octets transmitted by the queues allocated to this vf. this register includes bytes transmitted in a packet from the field through the field, inclusive, including any padding added by the hardware. the vlan tag added by the hardware is counted as part of the packet. this register counts octets in successfully transmitted packets that are 64 or more bytes in length. this counter includes loopback packets or packets later dropped by the switch or the mac.

note: unlike some other statistics registers that are not allocated per vf, this register is not cleared on read. furthermore, the register continues to count from 0x0000 on stepping beyond 0xffff.

field | bit(s) | initial value | description
gotc | 31:0 | 0x0 | number of good octets transmitted
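because these per-vf counters are not cleared on read and wrap back to zero, software that wants a running total has to take differences between successive reads. the following is a minimal sketch of that bookkeeping, not driver code. it assumes a 32-bit wrap, matching the 31:0 field width in the tables (the text itself mentions 0xffff, so the mask is left as a parameter), and the class name is hypothetical.

```python
COUNTER_MASK = 0xFFFFFFFF  # assumption: the counter wraps at the 32-bit field width


def counter_delta(previous: int, current: int, mask: int = COUNTER_MASK) -> int:
    """Events counted between two raw reads, tolerating a single wrap-around."""
    return (current - previous) & mask


class RunningCounter:
    """Accumulates a wide total from successive raw reads of a wrapping counter."""

    def __init__(self, first_raw: int, mask: int = COUNTER_MASK) -> None:
        self.last_raw = first_raw
        self.mask = mask
        self.total = 0

    def update(self, raw: int) -> int:
        """Feed a new raw read and return the accumulated total."""
        self.total += counter_delta(self.last_raw, raw, self.mask)
        self.last_raw = raw
        return self.total


if __name__ == "__main__":
    counter = RunningCounter(0xFFFFFFF0)
    print(counter.update(0x00000010))  # wrapped once between reads
    print(counter.update(0x00000030))
```

the difference is only correct if the counter wraps at most once between two reads, so the polling interval has to be short enough for the expected traffic rate.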
8.27.42 multicast packets received count - vfmprc (0x0f3c; ro)

this register counts the number of good (no errors) multicast packets received by a given vm. this register does not count multicast packets received that fail to pass address filtering nor does it count received flow control packets. this register only increments if receives are enabled. this register does not count packets counted by the missed packet count (mpc) register.

note: unlike some other statistics registers that are not allocated per vf, this register is not cleared on read. furthermore, the register continues to count from 0x0000 on stepping beyond 0xffff.

field | bit(s) | initial value | description
mprc | 31:0 | 0x0 | number of multicast packets received.

8.27.43 good tx octets loopback count - vfgotlbc (0x0f50; ro)

this register counts the number of good (no errors) octets transmitted by the queues allocated to this vf that were sent to a local vf. this counter includes packets that are sent to the lan and to a local vm. this register includes bytes transmitted in a packet from the field through the field, inclusive, including any padding added by the hardware. the vlan tag added by the hardware is counted as part of the packet.

note: unlike some other statistics registers that are not allocated per vf, this register is not cleared on read. furthermore, the register continues to count from 0x0000 on stepping beyond 0xffff.

field | bit(s) | initial value | description
gotlbc | 31:0 | 0x0 | number of good octets transmitted to loopback

8.27.44 good tx packets loopback count - vfgptlbc (0x0f44; ro)

this register counts the number of good (no errors) packets transmitted by the queues allocated to this vf that were sent to a local vf. this counter includes packets that are sent to the lan and to a local vm.

note: unlike some other statistics registers that are not allocated per vf, this register is not cleared on read. furthermore, the register continues to count from 0x0000 on stepping beyond 0xffff.

field | bit(s) | initial value | description
gptlbc | 31:0 | 0x0 | number of good packets transmitted to loopback

8.27.45 good rx octets loopback count - vfgorlbc (0x0f48; ro)

this register counts the number of good (no errors) octets received by the queues allocated to this vf that were sent from some local vfs.
note: unlike some other statistics registers that are not allocated per vf, this register is not cleared on read. furthermore, the register continues to count from 0x0000 on stepping beyond 0xffff.

field | bit(s) | initial value | description
gorlbc | 31:0 | 0x0 | number of good octets received from loopback

8.27.46 good rx packets loopback count - vfgprlbc (0x0f40; ro)

this register counts the number of good (no errors) packets received by the queues allocated to this vf that were sent from some local vfs.

note: unlike some other statistics registers that are not allocated per vf, this register is not cleared on read. furthermore, the register continues to count from 0x0000 on stepping beyond 0xffff.

field | bit(s) | initial value | description
gprlbc | 31:0 | 0x0 | number of good packets received from loopback

8.27.47 virtual function mailbox - vfmailbox (0x0c40; rw)

see section 8.14.3 for information about this register.

8.27.48 virtualization mailbox memory - vmbmem (0x0800:0x083c; r/w)

a 64-byte mailbox memory for pf and vf driver communication. locations can be accessed as 32-bit or 64-bit words. see section 8.14.4 for information about this register.

8.27.49 tx packet buffer wrap around counter - pbtwac (0x34e8; ro)

see section 8.3.4 for information about this register.

8.27.50 rx packet buffer wrap around counter - pbrwac (0x24e8; ro)

see section 8.3.5 for information about this register.
8.27.51 switch packet buffer wrap around counter - pbswac (0x30e8; ro)

see section 8.3.6 for information about this register.
9.0 pcie programming interface

9.1 pcie compatibility

pcie is completely compatible with existing deployed pci software. to achieve this, pcie hardware implementations conform to the following requirements:

- all devices required to be supported by deployed pci software must be enumerable as part of a tree through pci device enumeration mechanisms.
- devices in their default operating state must conform to pci ordering and cache coherency rules from a software viewpoint.
- pcie devices must conform to pci power management specifications and must not require any register programming for pci-compatible power management beyond those available through pci power management capabilities registers. power management is expected to conform to a standard pci power management by existing pci bus drivers.
- pcie devices implement all registers required by the pci specification as well as the power management registers and capability pointers specified by the pci power management specification. in addition, pcie defines a pcie capability pointer to indicate support for pcie extensions and associated capabilities.

the 82576 is a multi-function device with the following functions:

table 9-1. intel 82576 gbe controller functions

function number | function description | disable options
0 or 1 | lan 0 | strapping option.
1 or 0 | lan 1 | strapping option / eeprom word 0x10, bit 11.

lan0 and lan1 are shown in pci functions 0 and 1. the lan function sel field in eeprom word 0x21 (reflected in the factps register (0x5b30)) determines if lan0 appears in pci function 0 or pci function 1. lan1 appears in the complementary pci function. see section 4.3 for a description of the function mapping when some of the functions are disabled.

all functions contain the following regions of the pci configuration space:

- mandatory pci configuration registers
- power management capabilities
- msi capabilities
- pcie extended capabilities
9.2 configuration sharing among pci functions

the 82576 contains a single physical pcie core interface. the 82576 is designed so that each of the logical lan devices appears as a distinct function. many of the fields of the pcie header space contain hardware default values that are either fixed or might be overridden using eeprom, but might not be independently specified for each logical lan device.

the following fields are considered to be common to both lan devices:

table 9-2. common fields for lan devices

field | description
vendor id | fixed to 8086.
revision | the revision number of the 82576 is reflected identically for both lan functions.
header type | this field indicates if a device is single function or multifunction. the value reflected in this field is reflected identically for both lan functions, but the actual value reflected depends on lan disable configuration. see section 9.4.9 for details.
subsystem id | the subsystem id of the 82576 can be specified via eeprom, but only a single value can be specified. the value is reflected identically for both lan functions.
subsystem vendor id | the subsystem vendor id of the 82576 can be specified via eeprom, but only a single value can be specified. the value is reflected identically for both lan functions.
cap_ptr, max latency, min grant | these fields reflect fixed values that are constant values reflected for both lan functions.

the following fields are implemented individually for each lan function:

table 9-3. fields implemented differently in lan functions

field | description
device id | the device id reflected for each lan function can be independently specified via eeprom.
command, status | each lan function implements its own command/status registers.
latency timer, cache line size | each lan function implements these registers individually. the system should program these fields identically for each lan to ensure consistent behavior and performance of each function.
memory bar, flash bar, io bar, msi-x bar, expansion rom bar | each lan function implements its own base address registers, enabling each function to claim its own address region(s).
interrupt pin | each lan function independently indicates which interrupt pin (inta# or intb#) is used by that function's mac to signal system interrupts. the value for each lan function can be independently specified via eeprom, but only if both lan functions are enabled.
class code | different class code values (iscsi/lan) can be set for each function.

see section 9.7 for a description of the configuration space reflected to virtual functions.

9.3 register map

9.3.1 register attributes

configuration registers are assigned one of the attributes described in the following table.
table 9-4. configuration registers

rd/wr | description
ro | read-only register: register bits are read-only and cannot be altered by software.
rw | read-write register: register bits are read-write and can be either set or reset.
r/w1c | read-only status, write-1-to-clear status register: writing a 0b to r/w1c bits has no effect.
ros | read-only register with sticky bits: register bits are read-only and cannot be altered by software. bits are not cleared by reset and can only be reset with the pwrgood signal. devices that consume aux power are not allowed to reset sticky bits when aux power consumption (either via aux power or pme enable) is enabled.
rws | read-write register: register bits are read-write and can be either set or reset by software to the desired state. bits are not cleared by reset and can only be reset with the pwrgood signal. devices that consume aux power are not allowed to reset sticky bits when aux power consumption (either via aux power or pme enable) is enabled.
r/w1cs | read-only status, write-1-to-clear status register: register bits indicate status when read, a set bit indicating a status event can be cleared by writing a 1b. writing a 0b to r/w1c bits has no effect. bits are not cleared by reset and can only be reset with the pwrgood signal. devices that consume aux power are not allowed to reset sticky bits when aux power consumption (either via aux power or pme enable) is enabled.
hwinit | hardware initialized: register bits are initialized by firmware or hardware mechanisms such as pin strapping or serial eeprom. bits are read-only after initialization and can only be reset (for write-once by firmware) with the pwrgood signal.
rsvdp | reserved and preserved: reserved for future r/w implementations; software must preserve value read for writes to bits.
rsvdz | reserved and zero: reserved for future r/w1c implementations; software must use 0b for writes to bits.

the pci configuration registers map is listed in table 9-5. refer to a detailed description for registers loaded from the eeprom at initialization time. note that initialization values of the configuration registers are marked in parentheses.
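the write and reset behavior implied by the attributes in table 9-4 can be modeled in a few lines. this is a simplified sketch for illustration only: the function names are hypothetical, hwinit is treated as read-only after initialization, and the rsvdp/rsvdz rules (which constrain software rather than hardware) are left out.

```python
READ_ONLY = {"ro", "ros", "hwinit"}          # software writes are ignored
READ_WRITE = {"rw", "rws"}                   # software writes replace the value
WRITE_ONE_TO_CLEAR = {"r/w1c", "r/w1cs"}     # writing 1b clears, 0b has no effect
STICKY = {"ros", "rws", "r/w1cs", "hwinit"}  # only reset with the pwrgood signal


def apply_write(attribute: str, current: int, written: int, width: int = 32) -> int:
    """Register value after software writes `written` to a field of this attribute."""
    mask = (1 << width) - 1
    current &= mask
    written &= mask
    if attribute in READ_ONLY:
        return current
    if attribute in READ_WRITE:
        return written
    if attribute in WRITE_ONE_TO_CLEAR:
        return current & ~written & mask
    raise ValueError(f"unknown attribute: {attribute}")


def apply_reset(attribute: str, current: int, initial: int, pwrgood_reset: bool) -> int:
    """Register value after a reset; sticky bits survive unless pwrgood resets them."""
    if attribute in STICKY and not pwrgood_reset:
        return current
    return initial


if __name__ == "__main__":
    print(bin(apply_write("r/w1c", 0b1010, 0b0010)))  # clears only bit 1
    print(apply_reset("rws", 0x5, 0x0, pwrgood_reset=False))
```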
9.3.2 pcie configuration space summary

table 9-5. pcie configuration registers map

each row lists the byte offset followed by the fields at that offset, from byte 3 down to byte 0.

mandatory pci register
0x0 | device id | vendor id
0x4 | status register | control register
0x8 | class code (0x020000/0x010000) | revision id
0xc | bist (0x00) | header type (0x0/0x80) | latency timer | cache line size (0x10)
0x10 | base address register 0
0x14 | base address register 1
0x18 | base address register 2
0x1c | base address register 3
0x20 | base address register 4
0x24 | base address register 5
0x28 | cardbus cis pointer (0x0000)
0x2c | subsystem device id | subsystem vendor id
0x30 | expansion rom base address
0x34 | reserved | cap ptr (0x40)
0x38 | reserved
0x3c | max latency (0x00) | min grant (0x00) | interrupt pin (0x01/0x02) | interrupt line (0x00)

power management capability
0x40 | power management capabilities | next pointer (0x50) | capability id (0x01)
0x44 | data | bridge support extensions | power management control & status

msi capability
0x50 | message control (0x0080) | next pointer (0x70) | capability id (0x05)
0x54 | message address
0x58 | message upper address
0x5c | reserved | message data
0x60 | mask bits
0x64 | pending bits

msi-x capability
0x70 | message control (0x00090) | next pointer (0xa0) | capability id (0x11)
0x74 | table offset
0x78 | pba offset

pcie capability
0xa0 | pcie capability register (0x0002) | next pointer (0xe0) | capability id (0x10)
0xa4 | device capability
0xa8 | device status | device control
0xac | link capability
0xb0 | link status | link control
0xb4 | reserved
0xb8 | reserved | reserved
0xbc | reserved
0xc0 | reserved | reserved
0xc4 | device capability 2
0xc8 | reserved | device control 2
0xcc | reserved
0xd0 | reserved | reserved
0xd4 | reserved
0xd8 | reserved | reserved

vpd capability
0xe0 | vpd address | next pointer (0x00) | capability id (0x03)
0xe4 | vpd data

aer capability
0x100 | next capability ptr. (0x140) | version (0x1) | aer capability id (0x0001)
0x104 | uncorrectable error status
0x108 | uncorrectable error mask
0x10c | uncorrectable error severity
0x110 | correctable error status
0x114 | correctable error mask
0x118 | advanced error capabilities and control register
0x11c:0x128 | header log

serial id capability
0x140 | next capability ptr. (0x150) | version (0x1) | serial id capability id (0x0003)
0x144 | serial number register (lower dword)
0x148 | serial number register (upper dword)

ari capability
0x150 | next capability ptr. (0x160) | version (0x1) | ari capability id (0x000e)
0x154 | ari control register | ari capabilities

sr-iov capability
0x160 | next capability offset (0x0) | version (0x1) | iov capability id (0x0010)
0x164 | sr iov capabilities
0x168 | sr iov status | sr iov control
0x16c | totalvfs (ro) | initial vf (ro)
0x170 | reserved | function dependency link (ro) | num vf (rw)
0x174 | vf stride (ro) | first vf offset (ro)
0x178 | vf device id | reserved
0x17c | supported page size (0x553)
0x180 | system page size (rw)
0x184 | vf bar0 - low (rw)
0x188 | vf bar0 - high (rw)
0x18c | vf bar2 (ro)
0x190 | vf bar3 - low (rw)
0x194 | vf bar3 - high (rw)
0x198 | vf bar5 (ro)
0x19c | vf migration state array offset (ro)

an explanation of registers is provided in sections that follow.

9.4 mandatory pci configuration registers

9.4.1 vendor id register (0x0; ro)

this is a read-only register that has the same value for all pci functions. it identifies unique intel products with a value of 0x8086.

9.4.2 device id register (0x2; ro)

this is a read-only register. this field identifies individual 82576 functions. it has the same default value for the two lan functions but can be auto-loaded from the eeprom during initialization with a different value for each port. the following table describes the possible values according to the sku and functionality of each function. for the latest device id information, see the product specification update.
pci function | default value | eeprom address | meaning
lan 0 | 0x10c9 | 0x0d | 0x10c9 - dual port 10/100/1000 mb/s ethernet controller, x4 pcie, copper. 0x10e6 - dual port 1000 mb/s ethernet controller, x4 pcie, fiber. 0x10e7 - dual port 10/100/1000 mb/s ethernet controller, x4 pcie, serdes.
lan 0 | 0x10ca | 0x26 | 0x10ca - virtual function of 10/100/1000 mb/s ethernet controller.
lan 0 | 0x10a6 | 0x1d | 0x10a6 - dummy function (see note).
lan 1 | 0x10c9 | 0x11 | 0x10c9 - dual port 10/100/1000 mb/s ethernet controller, x4 pcie, copper. 0x10e6 - dual port 1000 mb/s ethernet controller, x4 pcie, fiber. 0x10e7 - dual port 10/100/1000 mb/s ethernet controller, x4 pcie, serdes.
lan 1 | 0x10ca | 0x26 | 0x10ca - virtual function of 10/100/1000 mb/s ethernet controller.
lan 1 | 0x10a6 | 0x1d | 0x10a6 - dummy function (see note).

note: the dummy function device id is loaded from eeprom address 0x1d and is used according to the disable status of the function. it is applicable only to function 0. see section 6.2.6 for details.

9.4.3 command register (0x4; r/w)

this is a read/write register. each function has its own command register. unless explicitly specified, functionality is the same in all functions.

bit(s) | r/w | initial value | description
0 | r/w | 0b | i/o access enable. for lan functions this field is r/w. for dummy function this field is ro as zero.
1 | r/w | 0b | memory access enable. for lan functions this field is r/w. for dummy function this field is ro as zero.
2 | r/w | 0b | bus master enable (bme). for lan functions this field is r/w. for dummy function this field is ro as zero.
3 | ro | 0b | special cycle monitoring. hardwired to 0b.
4 | ro | 0b | mwi enable. hardwired to 0b.
5 | ro | 0b | palette snoop enable. hardwired to 0b.
6 | rw | 0b | parity error response.
7 | ro | 0b | wait cycle enable. hardwired to 0b.
8 | rw | 0b | serr# enable.
9 | ro | 0b | fast back-to-back enable. hardwired to 0b.
10 | rw | 0b | interrupt disable (1).
15:11 | ro | 0x0 | reserved.

1. the interrupt disable register bit is a read-write bit that controls the ability of a pcie function to generate a legacy interrupt message. when set, functions are prevented from generating legacy interrupt messages.

9.4.4 status register (0x6; ro)

each function has its own status register. unless explicitly specified, functionality is the same in all functions.

bits | r/w | initial value | description
2:0 | | 000b | reserved.
3 | ro | 0b | interrupt status (1).
4 | ro | 1b | new capabilities. indicates that a function implements extended capabilities. the 82576 sets this bit, and implements a capabilities list, to indicate that it supports pci power management, message signaled interrupts (msi), enhanced message signaled interrupts (msi-x), vital product data (vpd), and the pcie extensions.
5 | | 0b | 66 mhz capable. hardwired to 0b.
6 | | 0b | reserved.
7 | | 0b | fast back-to-back capable. hardwired to 0b.
8 | r/w1c | 0b | data parity reported.
10:9 | | 00b | devsel timing. hardwired to 0b.
11 | r/w1c | 0b | signaled target abort.
12 | r/w1c | 0b | received target abort.
13 | r/w1c | 0b | received master abort.
14 | r/w1c | 0b | signaled system error.
15 | r/w1c | 0b | detected parity error.

1. the interrupt status field is a ro field that indicates that an interrupt message is pending internally to the function.
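status bit 4 tells software that a capabilities list is present, and the next-pointer values in table 9-5 chain the capabilities together. the sketch below walks both chains over a synthetic configuration-space image filled only with the ids and next pointers from table 9-5. the helper names are hypothetical, the image is a stand-in for real hardware reads, and the extended header layout (id in bits 15:0, version in 19:16, next offset in 31:20) is the standard pcie one.

```python
def build_example_config() -> bytearray:
    """Synthetic 4 KB config space holding only the capability headers of table 9-5."""
    cfg = bytearray(4096)
    cfg[0x06] = 0x10  # status register bit 4: capabilities list present
    cfg[0x34] = 0x40  # cap ptr
    for offset, cap_id, next_ptr in ((0x40, 0x01, 0x50), (0x50, 0x05, 0x70),
                                     (0x70, 0x11, 0xA0), (0xA0, 0x10, 0xE0),
                                     (0xE0, 0x03, 0x00)):
        cfg[offset] = cap_id
        cfg[offset + 1] = next_ptr
    for offset, cap_id, next_ptr in ((0x100, 0x0001, 0x140), (0x140, 0x0003, 0x150),
                                     (0x150, 0x000E, 0x160), (0x160, 0x0010, 0x000)):
        header = cap_id | (0x1 << 16) | (next_ptr << 20)
        cfg[offset:offset + 4] = header.to_bytes(4, "little")
    return cfg


def walk_capabilities(cfg: bytearray) -> list:
    """(offset, capability id) pairs of the pci-compatible capability list."""
    found = []
    status = cfg[0x06] | (cfg[0x07] << 8)
    if not status & 0x10:
        return found
    offset, seen = cfg[0x34] & 0xFC, set()
    while offset and offset not in seen:  # `seen` guards against pointer loops
        seen.add(offset)
        found.append((offset, cfg[offset]))
        offset = cfg[offset + 1] & 0xFC
    return found


def walk_extended_capabilities(cfg: bytearray) -> list:
    """(offset, capability id) pairs of the pcie extended capability list."""
    found, offset, seen = [], 0x100, set()
    while offset and offset not in seen:
        seen.add(offset)
        header = int.from_bytes(cfg[offset:offset + 4], "little")
        if header in (0, 0xFFFFFFFF):  # no capability implemented here
            break
        found.append((offset, header & 0xFFFF))
        offset = (header >> 20) & 0xFFC
    return found


if __name__ == "__main__":
    image = build_example_config()
    print([(hex(o), hex(i)) for o, i in walk_capabilities(image)])
    print([(hex(o), hex(i)) for o, i in walk_extended_capabilities(image)])
```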
9.4.5 revision register (0x8; ro)

the default revision id of the 82576 is 0x01. the value of the rev id is read from eeprom word 0x1e. note that lan 0 and lan 1 functions have the same revision id.

9.4.6 class code register (0x9; ro)

the class code is a ro hard coded value that identifies the 82576's functionality.

- lan 0, lan 1 - 0x020000/0x010000 - ethernet/scsi adapter (1)

1. selected according to bit 11 or 12 in word 0x1e in the eeprom for lan0 and lan 1 respectively.

9.4.7 cache line size register (0xc; r/w)

this field is implemented by pcie functions as a read-write field for legacy compatibility purposes but has no impact on any pcie function functionality. loaded from eeprom word 0x1a. all functions are initialized to the same value. in eeprom-less systems, the value is 0x10.

9.4.8 latency timer register (0xd; ro)

not used. hardwired to zero.

9.4.9 header type register (0xe; ro)

this indicates if a device is single function or multifunction. if a single lan function is the only active one then this field has a value of 0x00 to indicate a single function device. if other functions are enabled then this field has a value of 0x80 to indicate a multi-function device. the following table lists the different options to set the header type field.

lan 0 enabled | lan 1 enabled | cross mode enable | dummy function enable | header type expected value
0 | 0 | x | x | n/a (no function)
1 | 0 | 0 | x | 0x00
0 | 1 | 0 | 0 | 0x00
0 | 1 | 0 | 1 | 0x80 (dummy exist)
1 | 1 | x | x | 0x80 (dual function)
1 | 0 | 1 | 0 | 0x00
1 | 0 | 1 | 1 | 0x80 (dummy exist)
0 | 1 | 1 | x | 0x00

9.4.10 bist register (0xf; ro)

bist is not supported in the 82576.
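the header type table can be condensed into one rule: the device reports 0x80 when both lan functions are enabled, or when the lan that would occupy pci function 0 is disabled and the dummy function takes its place. the sketch below encodes that reading. the function name is hypothetical, and the assumption that cross mode swaps which lan sits at function 0 is an interpretation of the table rows, all eight of which the rule reproduces.

```python
from typing import Optional


def header_type(lan0_enabled: bool, lan1_enabled: bool,
                cross_mode: bool, dummy_enabled: bool) -> Optional[int]:
    """Expected header type value, or None when no function is enabled."""
    if not lan0_enabled and not lan1_enabled:
        return None   # n/a (no function)
    if lan0_enabled and lan1_enabled:
        return 0x80   # dual function
    # exactly one lan is enabled. assumption: function 0 hosts lan 0 normally
    # and lan 1 when cross mode is enabled.
    function0_active = lan1_enabled if cross_mode else lan0_enabled
    if function0_active:
        return 0x00   # single function at function 0
    # function 0 is vacant: the dummy function, if enabled, fills it
    return 0x80 if dummy_enabled else 0x00


if __name__ == "__main__":
    # lan 1 only, no cross mode, dummy function enabled
    print(hex(header_type(False, True, False, True)))
```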
9.4.11 base address registers (0x10:0x27; r/w)

the base address registers (or bars) are used to map the 82576 register space of the various functions. each memory mapping window uses either a 32-bit address in one register or a 64-bit address in two registers.

9.4.11.1 32-bit mapping

this mapping is selected when bits 11:10 in word 0x21 in the eeprom are equal to 00.

bar | addr. | 31:4 | 3 | 2:1 | 0
0 | 0x10 | memory bar (r/w - 31:17; 0b - 16:4) | 0 | 00 | 0
1 | 0x14 | flash bar (r/w - 31:23/16; 0b - 22/15:4). note: see remark regarding flash size. | 0 | 00 | 0
2 | 0x18 | io bar (r/w - 31:5; 0b - 4:1) | 0 | 00 | 1
3 | 0x1c | msi-x bar (r/w - 31:14; 0b - 13:4) | 0 | 00 | 0
4 | 0x20 | reserved (read as all 0b's)
5 | 0x24 | reserved (read as all 0b's)

all base address registers have the following fields:

bit(s) | r/w | initial value | description
0 | r | 0b for memory, 1b for i/o | 0b = indicates memory space. 1b = indicates i/o.
2:1 | r | 00b | memory type. indicates the address space size. 00b = 32-bit.
3 | r | 0b | prefetch memory. 0b = non-prefetchable space. 1b = prefetchable space. this bit should be set only on systems that do not generate prefetchable cycles. this bit is read from eeprom word 0x21 bit 9.
31:4 | r/w | 0x0 | memory address space. which bits are read/write and which are hardwired to 0b depends on the memory mapping window sizes: lan memory spaces are 128 kb. lan flash spaces can be 64 kb, up to 8 mb in powers of two; the mapping window size is set by eeprom word 0x0f. msi-x memory space is 16 kb. io address space is 32 bytes.
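two details of the bar tables lend themselves to a short illustration: the low bits identify the kind of window, and the lowest read/write address bit fixes the window size (for example r/w 31:17 means a 128 kb window). the sketch below decodes a raw bar value and derives sizes from those bit positions. the function names and the sample bar values are hypothetical.

```python
def decode_bar(value: int) -> dict:
    """Decode the type bits of a 32-bit BAR register value."""
    if value & 0x1:  # bit 0 = 1b: i/o space
        return {"space": "io", "base": value & 0xFFFFFFFC}
    return {
        "space": "memory",
        "is_64bit": ((value >> 1) & 0x3) == 0b10,  # bits 2:1: 00b = 32-bit, 10b = 64-bit
        "prefetchable": bool(value & 0x8),         # bit 3
        "base": value & 0xFFFFFFF0,
    }


def window_size(lowest_writable_bit: int) -> int:
    """Window size in bytes when address bits below this bit are hardwired to 0b."""
    return 1 << lowest_writable_bit


if __name__ == "__main__":
    print(decode_bar(0xFEBC0000))     # example 32-bit memory bar value
    print(decode_bar(0x0000E001))     # example i/o bar value
    print(window_size(17) // 1024)    # lan memory window in kb (r/w - 31:17)
    print(window_size(14) // 1024)    # msi-x window in kb (r/w - 31:14)
    print(window_size(5))             # io window in bytes (r/w - 31:5)
```

the flash bar follows the same rule: a lowest read/write bit of 16 gives 64 kb and a lowest read/write bit of 23 gives 8 mb.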
9.4.11.2 64-bit mapping without i/o bar
this mapping is selected when bits 11:10 in word 0x21 in the eeprom are equal to 10b.

bar | addr. | bits 31:4 | bit 3 | bits 2:1 | bit 0
0 | 0x10 | memory bar (r/w - 31:17; 0b - 16:4) | 0 | 00 | 0
1 | 0x14 | memory bar high word (r/w - 31:0)
2 | 0x18 | flash bar (r/w - 31:23/16; 0b - 22/15:4). note: see remark regarding flash size. | 0 | 00 | 0
3 | 0x1c | flash bar high word (r/w - 31:0)
4 | 0x20 | msi-x bar (r/w - 31:14; 0b - 13:4) | 0 | 00 | 0
5 | 0x24 | msi-x bar high word (r/w - 31:0)

all even base address registers have the following fields:

bit(s) | r/w | initial value | description
0 | r | 0b | 0b = indicates memory space.
2:1 | r | 10b | memory type. indicates the address space size. 10b = 64-bit.
3 | r | 0b | prefetch memory. 0b = non-prefetchable space. 1b = prefetchable space. this bit should be set only on systems that do not generate prefetchable cycles. this bit is read from eeprom word 0x21 bit 9.
31:4 | r/w | 0x0 | memory address space. which bits are read/write and which are hardwired to 0b depends on the memory mapping window sizes: lan memory spaces are 128 kb. lan flash spaces can be from 64 kb up to 8 mb, in powers of two; the mapping window size is set by eeprom word 0x0f. msi-x memory space is 16 kb.

all odd base address registers have the following fields:

bit(s) | r/w | initial value | description
31:0 | r/w | 0x0 | memory address space high bytes.
9.4.11.3 64-bit mapping without flash bar
this mapping is selected when bits 11:10 of word 0x21 in the eeprom are equal to 11b.

bar | addr. | bits 31:4 | bit 3 | bits 2:1 | bit 0
0 | 0x10 | memory bar (r/w - 31:17; 0b - 16:4) | 0 | 00 | 0
1 | 0x14 | memory bar high word (r/w - 31:0)
2 | 0x18 | io bar (r/w - 31:5; 0b - 4:1) | 0 | 00 | 1
3 | 0x1c | reserved
4 | 0x20 | msi-x bar (r/w - 31:14; 0b - 13:4) | 0 | 00 | 0
5 | 0x24 | msi-x bar high word (r/w - 31:0)

all even base address registers have the following fields:

bit(s) | r/w | initial value | description
0 | r | 0b | 0b = indicates memory space. 1b = indicates i/o space.
2:1 | r | 10b - memory, 00b - io | memory type. indicates the address space size. 10b = 64-bit for memory bars. 00b = 32-bit for io bar.
3 | r | 0b | prefetch memory. 0b = non-prefetchable space. 1b = prefetchable space. this bit should be set only on systems that do not generate prefetchable cycles. this bit is read from eeprom word 0x21 bit 9.
31:4 | r/w | 0x0 | memory address space. which bits are read/write and which are hardwired to 0b depends on the memory mapping window sizes:
  • lan memory spaces are 128 kb.
  • lan flash spaces can be from 64 kb up to 8 mb, in powers of two. the mapping window size is set by eeprom word 0x0f.
  • msi-x memory space is 16 kb.
  • io address space is 32 bytes.

all odd base address registers have the following fields:

bit(s) | r/w | initial value | description
31:0 | r/w | 0x0 | memory address space high bytes.
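The three bar layouts in sections 9.4.11.1 to 9.4.11.3 and the low-order bar fields can be decoded as follows. This is a minimal sketch, not driver code: the function names and the returned dictionary are hypothetical, the i/o mask reflects the "r/w - 31:5" note in the tables, and the 01b selector value is reported as undescribed because this section does not define it.

```python
def bar_layout(eeprom_word_0x21: int) -> str:
    """Map bits 11:10 of eeprom word 0x21 to the bar layout named in the text."""
    selector = (eeprom_word_0x21 >> 10) & 0x3
    layouts = {
        0b00: "32-bit mapping",
        0b10: "64-bit mapping without i/o bar",
        0b11: "64-bit mapping without flash bar",
    }
    return layouts.get(selector, "not described in this section")


def decode_bar(low: int, high: int = 0) -> dict:
    """Decode a bar from its low register and, for 64-bit bars, the high word."""
    if low & 0x1:
        # i/o bar: bits 31:5 are r/w, bits 4:1 read as 0b, bit 0 is 1b.
        return {"space": "io", "is_64bit": False, "prefetchable": False,
                "base": low & 0xFFFFFFE0}
    is_64bit = ((low >> 1) & 0x3) == 0b10
    base = low & 0xFFFFFFF0
    if is_64bit:
        base |= (high & 0xFFFFFFFF) << 32   # odd bar holds the high bytes
    return {"space": "memory", "is_64bit": is_64bit,
            "prefetchable": bool(low & 0x8), "base": base}
```

With a 64-bit layout, the even bar supplies the low 32 bits and flags while the following odd bar supplies the upper 32 bits, so `decode_bar(0xFEBC0004, 0x1)` reports a base of 0x1FEBC0000.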
mapping window | mapping description
memory bar | the internal registers and memories are accessed as direct memory mapped offsets from the base address register. software can access dword or 64 bytes.
flash bar | the external flash can be accessed using direct memory mapped offsets from the flash base address register. software can access byte, word, dword or 64 bytes.
i/o bar | all internal registers, memories, and flash can be accessed using i/o operations. there are two 4-byte registers in the i/o mapping window: addr reg and data reg. software can access byte, word or dword.
msi-x bar | the internal registers and memories are accessed as direct memory mapped offsets from the base address register. software can access dword or 64 bytes.

9.4.12 cardbus cis register (0x28; ro)
not used. hardwired to zero.

9.4.13 subsystem vendor id register (0x2c; ro)
this value can be loaded automatically from eeprom address 0x0c at power up or reset. a value of 0x8086 is the default for this field at power up if the eeprom does not respond or is not programmed. all functions are initialized to the same value.

9.4.14 subsystem id register (0x2e; ro)
this value can be loaded automatically from eeprom address 0x0b at power up, with a default value of 0x0000.

9.4.15 expansion rom base address register (0x30; ro)
this register is used to define the address and size information for boot-time access to the optional flash memory. only the lan 0/lan 1 functions can have this window. it is enabled by the eeprom words 0x24 and 0x14 for lan 0 and lan 1, respectively. this register returns a zero value for functions without an expansion rom window.

bit(s) | r/w | initial value | description
0 | r/w | 0b | enable. 1b = enables expansion rom access. 0b = disables expansion rom access.
10:1 | r | 0x0 | reserved. always read as 0b. writes are ignored.
31:11 | r/w | 0x0 | address. which bits are r/w and which are hardwired to 0b depends on the memory mapping window size. the lan expansion rom spaces can be from 64 kb up to 8 mb, in powers of two. the mapping window size is set by eeprom word 0x0f.
9.4.16 cap_ptr register (0x34; ro)
the capabilities pointer field (cap_ptr) is an 8-bit field that provides an offset in the function's pci configuration space for the location of the first item in the capabilities linked list (cll). the 82576 sets this bit and implements a capabilities list to indicate that it supports pci power management, message signaled interrupts (msis), and pcie extended capabilities. its value is 0x40, which is the address of the first entry: pci power management.

9.4.17 interrupt line register (0x3c; rw)
read/write register programmed by software to indicate which of the system interrupt request lines this 82576's interrupt pin is bound to. see the pci definition for more details. each of the pci functions has its own register.

9.4.18 interrupt pin register (0x3d; ro)
read only register.
• lan 0 / lan 1 - a value of 0x1 / 0x2 / 0x3 / 0x4 indicates that this function implements legacy interrupt on inta / intb / intc / intd, respectively. loaded from eeprom word 0x24 / 0x14 for lan 0 and lan 1, respectively.
note: if one of the ports is disabled, the remaining port uses inta, independent of the eeprom setting.

9.4.19 max_lat/min_gnt (0x3e; ro)
not used. hardwired to zero.

9.5 pci capabilities
the first entry of the pci capabilities linked list is pointed to by the cap_ptr register. the following table describes the capabilities supported by the 82576.

address | item | next pointer
0x40-0x47 | pci power management | 0x50
0x50-0x67 | message signaled interrupt | 0x70
0x70-0x8b | extended message signaled interrupt | 0xa0
0xa0-0xdb | pcie capabilities | 0xe0/0x00 (1)
0xe0-0xe7 | vital product data capability | 0x00
(1) the next pointer is 0x00 when the vpd area in the eeprom does not exist. in eeprom-less mode, the pcie capability is the last one.

9.5.1 pci power management registers
all fields are reset on full power-up. all of the fields except pme_en and pme_status are reset on exit from d3cold state. if aux power is not supplied, the pme_en and pme_status fields also reset on exit from d3cold state.
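The capability chain listed in section 9.5 can be followed by starting at cap_ptr (0x34) and reading the id byte and next-pointer byte of each entry. The sketch below walks a synthetic 256-byte configuration space built from the offsets and ids in this chapter; `sample_config_space` and `walk_capabilities` are hypothetical helper names, and a real driver would read these bytes through its platform's configuration-space access mechanism instead.

```python
# Capability ids as given in sections 9.5.1.1, 9.5.2.1, 9.5.3.1, 9.5.4.1 and 9.5.5.1.
CAP_NAMES = {0x01: "pci power management", 0x05: "msi", 0x11: "msi-x",
             0x10: "pcie", 0x03: "vpd"}


def sample_config_space(vpd_present: bool = True) -> bytearray:
    """Build a synthetic config space holding only the capability chain."""
    cfg = bytearray(256)
    cfg[0x34] = 0x40                      # cap_ptr points at the first entry
    chain = [(0x40, 0x01, 0x50), (0x50, 0x05, 0x70), (0x70, 0x11, 0xA0),
             (0xA0, 0x10, 0xE0 if vpd_present else 0x00)]
    if vpd_present:
        chain.append((0xE0, 0x03, 0x00))
    for offset, cap_id, next_ptr in chain:
        cfg[offset] = cap_id              # byte 0: capability id
        cfg[offset + 1] = next_ptr        # byte 1: next pointer
    return cfg


def walk_capabilities(cfg) -> list:
    """Return (offset, name) pairs in list order; stops at 0x00 or on a loop."""
    found, seen = [], set()
    ptr = cfg[0x34]
    while ptr != 0x00 and ptr not in seen and ptr + 1 < len(cfg):
        seen.add(ptr)
        cap_id = cfg[ptr]
        found.append((ptr, CAP_NAMES.get(cap_id, hex(cap_id))))
        ptr = cfg[ptr + 1]
    return found
```

The `seen` set is a defensive choice: a corrupted next pointer that loops back would otherwise never terminate.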
see the detailed description for registers loaded from the eeprom at initialization time. some fields in this section depend on the power management ena bits in eeprom word 0x0a.

9.5.1.1 capability id register (0x40; ro)
this field equals 0x01, indicating the linked list item as being the pci power management registers.

9.5.1.2 next pointer (0x41; ro)
this field provides an offset to the next capability item in the capability list. its value of 0x50 points to the msi capability.

9.5.1.3 power management capabilities - pmc (0x42; ro)
this field describes the 82576's functionality at the power management states as described in the following table.

bits | r/w | default | description
15:11 | ro | see value in description column | pme_support. this five-bit field indicates the power states in which the function might assert pme#. its initial value is loaded from eeprom word 0x0a:
  condition | functionality | value
  pm dis in eeprom | no pme at all states | 00000b
  pm ena & no aux pwr | pme at d0 and d3hot | 01001b
  pm ena w aux pwr | pme at d0, d3hot and d3cold | 11001b
10 | ro | 0b | d2_support. the 82576 does not support d2 state.
9 | ro | 0b | d1_support. the 82576 does not support d1 state.
8:6 | ro | 000b | aux current - required current defined in the data register.
5 | ro | 1b | dsi. the 82576 requires its device driver to be executed following transition to the d0 uninitialized state.
4 | ro | 0b | reserved.
3 | ro | 0b | pme_clock. disabled. hardwired to 0b.
2:0 | ro | 011b | version. the 82576 complies with the pci pm specification, revision 1.2.

9.5.1.4 power management control / status register - pmcsr (0x44; r/w)
this register is used to control and monitor power management events in the 82576.
bits | r/w | default | description
15 | r/w1c | 0b (at power up) | pme_status. this bit is set to 1b when the function detects a wake-up event independent of the state of the pme_en bit. writing a 1b clears this bit.
14:13 | ro | 01b | data_scale. this field indicates the scaling factor to be used when interpreting the value of the data register. this field equals 01b (indicating 0.1 watt units) if power management is enabled in the eeprom and the data_select field is set to 0, 3, 4, 7 (or 8 for function 0). otherwise, this field equals 00b.
12:9 | r/w | 0000b | data_select. this four-bit field is used to select which data is to be reported through the data register and data_scale field. these bits are writable only when power management is enabled via eeprom.
8 | r/w | 0b (at power up) | pme_en. if power management is enabled in the eeprom, writing a 1b to this bit enables wake up. if power management is disabled in the eeprom, writing a 1b to this bit has no effect and does not set the bit to 1b.
7:4 | ro | 000000b | reserved.
3 | ro | 0b | no_soft_reset. this bit is always set to 0b to indicate that the 82576 performs an internal reset after a transition from d3hot to d0 via software control of the powerstate bits. configuration context is lost when performing the soft reset. after transitioning from the d3hot to the d0 state, a full re-initialization sequence is needed to return the 82576 to d0 initialized.
2 | ro | 0b | reserved for pcie.
1:0 | r/w | 00b | power state. this field is used to set and report the power state of a function as follows: 00b = d0. 01b = d1 (cycle ignored if written with this value). 10b = d2 (cycle ignored if written with this value). 11b = d3 (cycle ignored if power management is not enabled in the eeprom).

9.5.1.5 bridge support extensions - pmcsr_bse (0x46; ro)
this register is not implemented in the 82576. values are set to 0x00.

9.5.1.6 data register (0x47; ro)
this optional register is used to report power consumption and heat dissipation. the value reported is selected by the data_select field in the pmcsr, and the power scale is reported in the data_scale field in the pmcsr. the data of this field is loaded from the eeprom if power management is enabled in the eeprom; otherwise it has a default value of 0x00. the values for the 82576 functions are read from eeprom word 0x22.
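The pmc (0x42) and pmcsr (0x44) bit layouts from sections 9.5.1.3 and 9.5.1.4 can be unpacked with plain shifts and masks. This is an illustrative sketch with hypothetical function names. The mapping of pme_support bits to power states (bit 11 = d0 up to bit 15 = d3cold) follows the pci power management specification and agrees with the 01001b and 11001b rows in the pmc table.

```python
PME_STATES = ["d0", "d1", "d2", "d3hot", "d3cold"]   # pmc bits 11..15
POWER_STATES = {0b00: "d0", 0b01: "d1", 0b10: "d2", 0b11: "d3"}


def decode_pmc(pmc: int) -> dict:
    """Unpack the 16-bit power management capabilities register."""
    pme_support = (pmc >> 11) & 0x1F
    return {
        "version": pmc & 0x7,                          # bits 2:0
        "dsi": bool(pmc & (1 << 5)),                   # bit 5
        "d1_support": bool(pmc & (1 << 9)),
        "d2_support": bool(pmc & (1 << 10)),
        "pme_states": [name for bit, name in enumerate(PME_STATES)
                       if pme_support & (1 << bit)],
    }


def decode_pmcsr(pmcsr: int) -> dict:
    """Unpack the 16-bit power management control/status register."""
    return {
        "power_state": POWER_STATES[pmcsr & 0x3],      # bits 1:0
        "pme_en": bool(pmcsr & (1 << 8)),
        "data_select": (pmcsr >> 9) & 0xF,             # bits 12:9
        "data_scale": (pmcsr >> 13) & 0x3,             # bits 14:13
        "pme_status": bool(pmcsr & (1 << 15)),         # write 1b to clear
    }
```

Because pme_status is r/w1c, software that wants to change only the power state should write back the other bits with pme_status as 0b to avoid clearing a pending event unintentionally.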
the data register reports the following values, according to the data_select field:

function | d0 (consume/dissipate) | d3 (consume/dissipate) | common
data select | 0x0 / 0x4 | 0x3 / 0x7 | 0x8
function 0 | eeprom addr 0x22 | eeprom addr 0x22 | eeprom addr 0x22
function 1 | eeprom addr 0x22 | eeprom addr 0x22 | 0x00

for other data_select values, the data register output is reserved (0b).

9.5.2 msi configuration
this structure is required for pcie functions. there are no changes to this structure.

9.5.2.1 capability id register (0x50; ro)
this field equals 0x05, indicating the linked list item as being the msi registers.

9.5.2.2 next pointer register (0x51; ro)
this field provides an offset to the next capability item in the capability list. its value of 0x70 points to the msi-x capability structure.

9.5.2.3 message control register (0x52; r/w)
the register fields are described in the following table. there is a dedicated register per pci function to separately enable their msi.

bits | r/w | default | description
0 | r/w | 0b | msi enable. if set to 1b, msi is enabled. in this case, the 82576 generates an msi for interrupt assertion instead of intx signaling.
3:1 | ro | 000b | multiple message capable. the 82576 indicates a single requested message per function.
6:4 | ro | 000b | multiple message enable. the 82576 returns 000b to indicate that it supports a single message per function.
7 | ro | 1b | 64-bit capable. a value of 1b indicates that the 82576 is capable of generating 64-bit message addresses.
8 | ro | 1b | msi per-vector masking. a value of 1b indicates that the 82576 is capable of per-vector masking.
15:9 | ro | 0b | reserved. reads as 0b.
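The msi message control fields in section 9.5.2.3 decode as shown below. This is a sketch with a hypothetical function name; the "messages requested" value uses the power-of-two encoding that the pci specification defines for the multiple message capable field, of which the 82576 only reports 000b (one message).

```python
def decode_msi_message_control(value: int) -> dict:
    """Unpack the 16-bit msi message control register (offset 0x52)."""
    return {
        "msi_enable": bool(value & 0x1),                    # bit 0
        "messages_requested": 1 << ((value >> 1) & 0x7),    # bits 3:1
        "messages_enabled": 1 << ((value >> 4) & 0x7),      # bits 6:4
        "addr_64bit_capable": bool(value & (1 << 7)),
        "per_vector_masking": bool(value & (1 << 8)),
    }
```

With the defaults from the table (bits 7 and 8 set, everything else 0b) the register reads 0x0180 before system software enables msi.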
9.5.2.4 message address low register (0x54; r/w)
written by the system to indicate the lower 32 bits of the address to use for the msi memory write transaction. the lower two bits always return 0b regardless of the write operation.

9.5.2.5 message address high register (0x58; r/w)
written by the system to indicate the upper 32 bits of the address to use for the msi memory write transaction.

9.5.2.6 message data register (0x5c; r/w)
written by the system to indicate the lower 16 bits of the data written in the msi memory write dword transaction. the upper 16 bits of the transaction are written as 0b.

9.5.2.7 mask bits register (0x60; r/w)
the mask bits and pending bits registers enable software to disable or defer message sending on a per-vector basis. as the 82576 supports only one message, only bit 0 of these registers is implemented.

bits | r/w | default | description
0 | r/w | 0b | msi vector 0 mask. if set, the 82576 is prohibited from sending msi messages.
31:1 | ro | 000b | reserved.

9.5.2.8 pending bits register (0x64; r/w)

bits | r/w | default | description
0 | ro | 0b | if set, the 82576 has a pending msi message.
31:1 | ro | 000b | reserved.

9.5.3 msi-x configuration
more than one msi-x capability structure per function is prohibited, but a function is permitted to have both an msi and an msi-x capability structure.
in contrast to the msi capability structure, which directly contains all of the control/status information for the function's vectors, the msi-x capability structure instead points to an msi-x table structure and an msi-x pending bit array (pba) structure, each residing in memory space.
each structure is mapped by a base address register (bar) belonging to the function, located beginning at 0x10 in configuration space. a bar indicator register (bir) indicates which bar, and a qword-aligned offset indicates where the structure begins relative to the base address associated with the bar. the bar is permitted to be either 32-bit or 64-bit, but must map to memory space. a function is permitted to map both structures with the same bar, or to map each structure with a different bar.
the msi-x table structure, listed in table 8-19, typically contains multiple entries, each consisting of several fields: message address, message upper address, message data, and vector control. each entry is capable of specifying a unique vector.
the pba structure, described in the same section, contains the function's pending bits, one per table entry, organized as a packed array of bits within qwords. note that the last qword might not be fully populated.

9.5.3.1 capability id register (0x70; ro)
this field equals 0x11, indicating the linked list item as being the msi-x registers.

9.5.3.2 next pointer register (0x71; ro)
this field provides an offset to the next capability item in the capability list. its value of 0xa0 points to the pcie capability.

9.5.3.3 message control register (0x72; r/w)
the register fields are described in the following table. there is a dedicated register per pci function to separately enable their msi.

bits | r/w | default | description
10:0 | ro | 0x009 (1) | ts - table size. system software reads this field to determine the msi-x table size n, which is encoded as n-1. for example, a returned value of 0x00f indicates a table size of 16.
13:11 | ro | 0b | reserved. always returns 000b on read. write operation has no effect.
14 | r/w | 0b | fm - function mask. if set to 1b, all of the vectors associated with the function are masked, regardless of their per-vector mask bit states. if set to 0b, each vector's mask bit determines whether the vector is masked or not. setting or clearing the msi-x function mask bit has no effect on the state of the per-vector mask bits.
15 | r/w | 0b | en - msi-x enable. if set to 1b and the msi enable bit in the msi message control (mmc) register is 0b, the function is permitted to use msi-x to request service and is prohibited from using its intx# pin. system configuration software sets this bit to enable msi-x. a software device driver is prohibited from writing this bit to mask a function's service request. if set to 0b, the function is prohibited from using msi-x to request service.
(1) default value is read from the eeprom.
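The n-1 table size encoding above, together with the entry and pending-bit addressing formulas given in section 9.5.3.5, reduce to a few lines of integer arithmetic. The sketch below uses hypothetical function names; `table_base` and `pba_base` stand for the bar base address plus the table or pba offset, which the caller is assumed to have computed already.

```python
def msix_table_size(message_control: int) -> int:
    """Table size n from bits 10:0, which hold n-1."""
    return (message_control & 0x7FF) + 1


def split_offset_bir(reg: int) -> tuple:
    """Split a table/pba offset register into (qword-aligned offset, bir)."""
    return reg & 0xFFFFFFF8, reg & 0x7


def entry_address(table_base: int, k: int) -> int:
    """Starting address of msi-x table entry k (16 bytes per entry)."""
    return table_base + k * 16


def pending_bit_qword(pba_base: int, k: int) -> tuple:
    """(qword address, bit number) of pending bit k."""
    return pba_base + (k // 64) * 8, k % 64


def pending_bit_dword(pba_base: int, k: int) -> tuple:
    """(dword address, bit number) of pending bit k, for 32-bit accesses."""
    return pba_base + (k // 32) * 4, k % 32
```

For instance, the default table size field of 0x009 gives `msix_table_size(0x009) == 10`, so valid entry indices k run from 0 through 9.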
9.5.3.4 table offset register (0x74; r/w)

bits | r/w | default | description
31:3 | ro | 0x000 | table offset. used as an offset from the address contained by one of the function's bars to point to the base of the msi-x table. the lower three table bir bits are masked off (set to zero) by software to form a 32-bit qword-aligned offset.
2:0 | ro | 0x3 | table bir. indicates which one of a function's bars, located beginning at 0x10 in configuration space, is used to map the function's msi-x table into memory space. a bir value of 3 indicates that the table is mapped in bar 3. if 64-bit mmio mapping is used, this value is set to 4.

9.5.3.5 pba offset register (0x78; r/w)

bits | r/w | default | description
31:3 | ro | 0x400 | pba offset. used as an offset from the address contained by one of the function's bars to point to the base of the msi-x pba. the lower three pba bir bits are masked off (set to zero) by software to form a 32-bit qword-aligned offset.
2:0 | ro | 0x3 | pba bir. indicates which one of a function's base address registers, located beginning at 0x10 in configuration space, is used to map the function's msi-x pba into memory space. a bir value of 3 indicates that the pba is mapped in bar 3. if 64-bit mmio mapping is used, this value is set to 4.

to request service using a given msi-x table entry, a function performs a dword memory write transaction using the:
• contents of the message data field entry for data.
• contents of the message upper address field for the upper 32 bits of address.
• contents of the message address field entry for the lower 32 bits of address.
a memory read transaction from the address targeted by the msi-x message produces undefined results.
msi-x table entries and pending bits are each numbered 0 through n-1, where n-1 is indicated by the table size field in the mmc register. for a given arbitrary msi-x table entry k, its starting address can be calculated with the formula:
  entry starting address = table base + k*16
for the associated pending bit k, its address for qword access and bit number within that qword can be calculated with the formulas:
  qword address = pba base + (k div 64)*8
  qword bit# = k mod 64
software that chooses to read pending bit k with dword accesses can use these formulas:
  dword address = pba base + (k div 32)*4
  dword bit# = k mod 32

9.5.4 vital product data registers
the 82576 supports access to a vpd structure stored in the eeprom using the following set of registers.

9.5.4.1 capability id register (0xe0; ro)
this field equals 0x3, indicating the linked list item as being the vpd registers.

9.5.4.2 next pointer register (0xe1; ro)
offset to the next capability item in the capability list. a 0x00 value indicates that it is the last item in the capability-linked list.

9.5.4.3 vpd address register (0xe2; rw)
dword-aligned byte address of the vpd area in the eeprom to be accessed. the register is read/write with the initial value at power-up indeterminate.

bits | r/w | default | description
14:0 | rw | x | address. dword-aligned byte address of the vpd area in the eeprom to be accessed. the register is read/write with the initial value at power-up indeterminate. the two lsbs are ro as zero. this is the address relative to the start of the vpd area. as the maximal size supported by the 82576 is 256 bytes, bits 14:8 should always be zero.
15 | rw | 0b | f. a flag used to indicate when the transfer of data between the vpd data register and the storage component completes. the flag register is written when the vpd address register is written. 0b = read. set by hardware when data is valid. 1b = write. cleared by hardware when data is written to the eeprom. the vpd address and data should not be modified before the action completes.

9.5.4.4 vpd data register (0xe4; rw)
this register contains the vpd read/write data.

bits | r/w | default | description
31:0 | rw | x | vpd data. vpd data can be read or written through this register. the lsb of this register (at offset four in this capability structure) corresponds to the byte of vpd at the address specified by the vpd address register. the data read from or written to this register uses the normal pci byte transfer capabilities. four bytes are always transferred between this register and the vpd storage component. reading or writing data outside of the vpd space in the storage component is not allowed. in a write access, the data should be set before the address and the flag are set.
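The flag handshake in the vpd address register (section 9.5.4.3) can be summarized in two helpers: one that builds the 16-bit value software writes to start a transfer, and one that interprets a later read of the same register. Both function names are hypothetical, and the actual register access (and any polling timeout) is left to the caller.

```python
VPD_MAX_BYTES = 256      # maximal vpd size supported by the 82576
VPD_FLAG = 1 << 15       # f bit of the vpd address register


def vpd_address_value(addr: int, write: bool) -> int:
    """Value to write to the vpd address register to start a transfer."""
    if addr % 4 != 0:
        raise ValueError("vpd address must be dword-aligned")
    if not 0 <= addr < VPD_MAX_BYTES:
        raise ValueError("vpd address outside the 256-byte vpd area")
    # flag = 0b requests a read, flag = 1b requests a write.
    return (VPD_FLAG if write else 0) | addr


def vpd_transfer_done(addr_reg: int, write: bool) -> bool:
    """Interpret a read-back of the address register.

    A read completes when hardware sets the flag; a write completes
    when hardware clears it.
    """
    flag_set = bool(addr_reg & VPD_FLAG)
    return (not flag_set) if write else flag_set
```

For a write, the table's ordering rule applies: the vpd data register is loaded first, and only then is `vpd_address_value(addr, write=True)` written to the address register.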
9.5.5 pcie configuration registers
pcie provides two mechanisms to support native features:
• pcie defines a pci capability pointer indicating support for pcie.
• pcie extends the configuration space beyond the 256 bytes available for pci to 4096 bytes.
the 82576 implements the pcie capability structure for endpoint functions as follows:

9.5.5.1 capability id register (0xa0; ro)
this field equals 0x10, indicating the linked list item as being the pcie capabilities registers.

9.5.5.2 next pointer register (0xa1; ro)
offset to the next capability item in the capability list. its value of 0xe0 points to the vpd structure. if vpd is disabled, a value of 0x00 indicates that it is the last item in the capability-linked list.

9.5.5.3 pcie cap register (0xa2; ro)
the pcie capabilities register identifies the pcie device type and associated capabilities. this is a read only register identical to all functions.

bits | r/w | default | description
3:0 | ro | 0010b | capability version. indicates the pcie capability structure version number. the 82576 supports both version 1 and version 2 as loaded from the pcie capability version bit in the eeprom.
7:4 | ro | 0000b | device/port type. indicates the type of pcie functions. all functions are a native pci function with a value of 0000b.
8 | ro | 0b | slot implemented. the 82576 does not implement slot options; therefore this field is hardwired to 0b.
13:9 | ro | 00000b | interrupt message number. the 82576 does not implement multiple msis per function; therefore this field is hardwired to 0x0.
15:14 | ro | 00b | reserved.

9.5.5.4 device capability register (0xa4; rw)
this register identifies the pcie device specific capabilities. it is a read only register with the same value for the two lan functions and for all other functions.
device capability register (0xa4) fields:

bits | r/w | default | description
2:0 | ro | 010b | max payload size supported. this field indicates the maximum payload that the 82576 can support for tlps. it is loaded from the eeprom's pcie init configuration 3 word, 0x1a (with a default value of 512 bytes).
4:3 | ro | 00b | phantom function supported. not supported by the 82576.
5 | ro | 0b | extended tag field supported. max supported size of the tag field. the 82576 supports a 5-bit tag field for all functions.
8:6 | ro | 110b | endpoint l0s acceptable latency. this field indicates the acceptable latency that the 82576 can withstand due to the transition from the l0s state to the l0 state. all functions share the same value loaded from the eeprom pcie init configuration 1 word, 0x18.
11:9 | ro | 110b | endpoint l1 acceptable latency. this field indicates the acceptable latency that the 82576 can withstand due to the transition from the l1 state to the l0 state. all functions share the same value loaded from the eeprom pcie init configuration 1 word, 0x18.
12 | ro | 0b | attention button present. hardwired in the 82576 to 0b for all functions.
13 | ro | 0b | attention indicator present. hardwired in the 82576 to 0b for all functions.
14 | ro | 0b | power indicator present. hardwired in the 82576 to 0b for all functions.
15 | ro | 1b | role-based error reporting. this bit, when set, indicates that the 82576 implements the functionality originally defined in the error reporting ecn for pcie base specification 1.0a and later incorporated into pcie base specification 1.1. set to 1b in the 82576.
17:16 | ro | 000b | reserved.
25:18 | ro | 0x00 | slot power limit value. hardwired in the 82576 to 0x00 for all functions, as the 82576 consumes less than the 25 w allowed for its form factor.
27:26 | ro | 00b | slot power limit scale. hardwired in the 82576 to 0b for all functions, as the 82576 consumes less than the 25 w allowed for its form factor.
28 | ro | 1b | function level reset (flr) capability. a value of 1b indicates the function supports the optional flr mechanism.
31:29 | ro | 000b | reserved.

9.5.5.5 device control register (0xa8; rw)
this register controls the pcie specific parameters. there is a dedicated register for each function.
bits | r/w | default | description
0 | rw | 0b | correctable error reporting enable. enable error report.
1 | rw | 0b | non-fatal error reporting enable. enable error report.
2 | rw | 0b | fatal error reporting enable. enable error report.
3 | rw | 0b | unsupported request reporting enable. enable error report.
4 | rw | 1b | enable relaxed ordering. if this bit is set, the 82576 is permitted to set the relaxed ordering bit in the attribute field of write transactions that do not need strong ordering. for more details, see the description of the ro_dis bit in the ctrl_ext register in section 8.2.3.
7:5 | rw | 000b (128 bytes) | max payload size. this field sets the maximum tlp payload size for the 82576 functions. as a receiver, the 82576 must handle tlps as large as the set value. as a transmitter, the 82576 must not generate tlps exceeding the set value. the max payload size supported field in the 82576 device capability register indicates the permissible values that can be programmed.
8 | ro | 0b | extended tag field enable. not implemented in the 82576.
9 | ro | 0b | phantom functions enable. not implemented in the 82576.
10 | rw | 0b | auxiliary power pm enable. when set, enables the 82576 to draw aux power independent of pme aux power. the 82576 is a multi-function device; therefore it is allowed to draw aux power if at least one of the functions has this bit set.
11 | rw | 1b | enable no snoop. snoop is gated by the nonsnoop bits in the gcr register in the csr space.
14:12 | rw | 010b / 000b | max read request size. this field sets the maximum read request size for the device as a requester. 000b = 128 bytes (the default value for non-lan functions). 001b = 256 bytes. 010b = 512 bytes (the default value for the lan devices). 011b = 1 kb. 100b = 2 kb. 101b = reserved. 110b = reserved. 111b = reserved.
15 | rw | 0b | initiate function level reset. a write of 1b initiates an flr to the function. the value read by software from this bit is always 0b.
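The size encodings in the device control register follow a power-of-two pattern: 000b is 128 bytes and each increment doubles the size, which matches both the max read request list above and the 010b = 512 bytes default of the max payload size supported field. The sketch below decodes the register under that assumption; the function names are hypothetical, and reserved max read request encodings are returned as `None`.

```python
from typing import Optional


def size_from_encoding(enc: int) -> int:
    """128 bytes at 000b, doubling with each step of the 3-bit encoding."""
    return 128 << (enc & 0x7)


def max_read_request_bytes(enc: int) -> Optional[int]:
    """Max read request size; 101b..111b are reserved in the table."""
    return size_from_encoding(enc) if enc <= 0b100 else None


def decode_device_control(value: int) -> dict:
    """Unpack selected fields of the 16-bit device control register (0xa8)."""
    return {
        "relaxed_ordering": bool(value & (1 << 4)),
        "max_payload_bytes": size_from_encoding((value >> 5) & 0x7),   # bits 7:5
        "aux_power_pm": bool(value & (1 << 10)),
        "no_snoop": bool(value & (1 << 11)),
        "max_read_request_bytes": max_read_request_bytes((value >> 12) & 0x7),
    }
```

Using the lan-function defaults from the table (relaxed ordering and no snoop set, max payload 000b, max read request 010b) the register value is 0x2810, which decodes to a 128-byte max payload and a 512-byte max read request.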
9.5.5.6 device status register (0xaa; rw1c)
this register provides information about the pcie device's specific parameters. there is a dedicated register for each function.

bits | r/w | default | description
0 | rw1c | 0b | correctable detected. indicates status of correctable error detection.
1 | rw1c | 0b | non-fatal error detected. indicates status of non-fatal error detection.
2 | rw1c | 0b | fatal error detected. indicates status of fatal error detection.
3 | rw1c | 0b | unsupported request detected. indicates that the 82576 received an unsupported request. this field is identical in all functions. the 82576 cannot distinguish which function caused an error.
4 | ro | 0b | aux power detected. if aux power is detected, this field is set to 1b. it is a strapping signal from the periphery identical for all functions. reset on internal_power_on_reset and pe_rst_n only.
5 | ro | 0b | transaction pending. indicates whether the 82576 has any transaction pending. transactions include completions for any outstanding non-posted request for all used traffic classes.
15:6 | ro | 0x00 | reserved.
9.5.5.7 link cap register (0xac; ro)
this register identifies pcie link specific capabilities. this is a read only register identical to all functions.

bits | r/w | default | description
3:0 | ro | 0001b | max link speed. the 82576 indicates a maximum link speed of 2.5 gb/s.
9:4 | ro | 0x4 | max link width. indicates the maximum link width. the 82576 can support x1, x2 and x4 link widths. the field is loaded from the eeprom pcie init configuration 3 word, 0x1a, with a default value of four lanes. relevant encoding: 000000b = reserved. 000001b = x1. 000010b = x2. 000100b = x4.
11:10 | ro | 11b | active state link pm support. indicates the level of active state power management supported in the 82576. the encoding is: 00b = reserved. 01b = l0s entry supported. 10b = reserved. 11b = l0s and l1 supported. this field is loaded from the eeprom pcie init configuration 3 word, 0x1a.
14:12 | ro | 101b (1 µs - 2 µs) when non common clock; 110b (2 µs - 4 µs) when common clock | l0s exit latency. indicates the exit latency from l0s to l0 state. 000b = less than 64 ns. 001b = 64 ns - 128 ns. 010b = 128 ns - 256 ns. 011b = 256 ns - 512 ns. 100b = 512 ns - 1 µs. 101b = 1 µs - 2 µs. 110b = 2 µs - 4 µs. 111b = reserved. if the 82576 uses a common clock, the value of this field is loaded from pcie init config 1 word, 0x18, bits [2:0]. if the 82576 uses a separate clock, the value of this field is loaded from pcie init config 1 word, 0x18, bits [5:3].
link cap register (0xac), continued:

bits | r/w | default | description
17:15 | ro | 110b (32-64 µs) | l1 exit latency. indicates the exit latency from l1 to l0 state. this field is loaded from the eeprom pcie init configuration 1 word, 0x18. 000b = less than 1 µs. 001b = 1 µs - 2 µs. 010b = 2 µs - 4 µs. 011b = 4 µs - 8 µs. 100b = 8 µs - 16 µs. 101b = 16 µs - 32 µs. 110b = 32 µs - 64 µs. 111b = l1 transition not supported.
18 | ro | 0b | clock power management status. not supported in the 82576. ro as zero.
19 | ro | 0b | surprise down error reporting capable status. not supported in the 82576. ro as zero.
20 | ro | 0b | data link layer link active reporting capable status. not supported in the 82576. ro as zero.
21 | ro | 0b | link bandwidth notification capability status. not supported in the 82576. ro as zero.
23:22 | ro | 00b | reserved.
31:24 | hwinit | 0x0 | port number. the pcie port number for the given pcie link. field is set in the link training phase.

9.5.5.8 link control register (0xb0; ro)
this register controls pcie link specific parameters. there is a dedicated register for each function.

bits | r/w | default | description
1:0 | rw | 00b | active state link pm control. this field controls the active state of power management that is supported on the link. link pm functionality is determined by the lowest common denominator of all functions. the encoding is: 00b = pm disabled. 01b = l0s entry supported. 10b = reserved. 11b = l0s and l1 supported.
2 | ro | 0b | reserved.
3 | rw | 0b | read completion boundary.
4 | ro | 0b | link disable. not applicable for endpoint devices; hardwired to 0b.
5 | ro | 0b | retrain clock. not applicable for endpoint devices; hardwired to 0b.
link control register (0xb0), continued:

bits  r/w  default  description
6  rw  0b  common clock configuration. when this bit is set, it indicates that the 82576 and the component at the other end of the link are operating with a common reference clock. a value of 0b indicates that both operate with an asynchronous clock. this parameter affects the l0s exit latencies.
7  rw  0b  extended synch. when this bit is set, it forces an extended tx of a fts ordered set in fts and an extra ts1 at exit from l0s prior to entering l0.
8  ro  0b  enable clock power management. not supported in the 82576. ro as zero.
9  ro  0b  hardware autonomous width disable. not supported in the 82576. ro as zero.
10  ro  0b  link bandwidth management interrupt enable. not supported in the 82576. ro as zero.
11  ro  0b  link autonomous bandwidth interrupt enable. not supported in the 82576. ro as zero.
15:12  ro  0000b  reserved.

9.5.5.9 link status register (0xb2; ro)

this register provides information about pcie link specific parameters. this is a read only register identical to all functions.

bits  r/w  default  description
3:0  ro  0001b  link speed. indicates the negotiated link speed. note that the default setting (0001b) is the only defined speed, which is 2.5 gb/s.
9:4  ro  000001b  negotiated link width. indicates the negotiated width of the link. relevant encodings for the 82576 are:
    000001b = x1
    000010b = x2
    000100b = x4
10  ro  0b  reserved. (was: link training error)
11  ro  0b  link training. indicates that link training is in progress.
12  hwinit  1b  slot clock configuration. when set, indicates that the 82576 uses the physical reference clock that the platform provides on the connector. this bit must be cleared if the 82576 uses an independent clock. the slot clock configuration bit is loaded from the slot_clock_cfg eeprom bit.
link status register (0xb2), continued:

bits  r/w  default  description
13  ro  0b  data link layer link active. not supported in the 82576. ro as zero.
14  ro  0b  link bandwidth management status. not supported in the 82576. ro as zero.
15  ro  0b  reserved.

9.5.5.10 reserved registers (0xb4-0xc0; ro)

unimplemented reserved registers not relevant to pcie endpoint.

the two following registers are implemented only if the capability version is 2.

9.5.5.11 device cap 2 register (0xc4; ro)

this register identifies pcie device specific capabilities. it is a read only register with the same value for all functions.

bit location  r/w  default  description
3:0  ro  1111b  completion timeout ranges supported. this field indicates 82576 support for the optional completion timeout programmability mechanism. this mechanism enables system software to modify the completion timeout value. four time value ranges are defined:
    range a = 50 µs to 10 ms
    range b = 10 ms to 250 ms
    range c = 250 ms to 4 s
    range d = 4 s to 64 s
  bits are set according to the table in section 9.5.5.12 to show the timeout value ranges that are supported.
    0000b = completion timeout programming not supported. the 82576 must implement a timeout value in the range 50 µs to 50 ms.
    0001b = range a.
    0010b = range b.
    0011b = ranges a & b.
    0110b = ranges b & c.
    0111b = ranges a, b & c.
    1110b = ranges b, c & d.
    1111b = ranges a, b, c & d.
    all other values are reserved.
  it is strongly recommended that the completion timeout mechanism not expire in less than 10 ms.
device cap 2 register (0xc4), continued:

bit location  r/w  default  description
4  ro  1b  completion timeout disable supported. a value of 1b indicates support for the completion timeout disable mechanism.
5  ro  0b  ari forwarding supported. applicable only to switch downstream ports and root ports; must be set to 0b for other function types.
15:6  ro  0x0  reserved. set to 0x0.

9.5.5.12 device control 2 register (0xc8; rw)

this register controls pcie specific parameters. there is a dedicated register for each function.

note: the device control 2 register should only be written during initialization. when a port is enabled to transmit or receive data, this register should not be written even if the value is not changed.
9.6 pcie extended configuration space

pcie extended configuration space is located in a flat memory-mapped address space. pcie extends the configuration space beyond the 256 bytes available for pci to 4096 bytes. the 82576 decodes an additional 4 bits (bits 27:24) to provide the additional configuration space as shown in table 9-6. pcie reserves the remaining 4 bits (bits 31:28) for future expansion of the configuration space beyond 4096 bytes.

device control 2 register (0xc8; see section 9.5.5.12) bit fields:

bit location  r/w  default  description
3:0  rw  0000b  completion timeout value. see section 3.1.3.2.3, completion timeout period, for implemented values. in devices that support completion timeout programmability, this field enables system software to modify the completion timeout value. encoding:
    0000b = default range: 50 µs to 50 ms. it is strongly recommended that the completion timeout mechanism not expire in less than 10 ms.
  values available if the range a (50 µs to 10 ms) programmability range is supported:
    0001b = 50 µs to 100 µs.
    0010b = 1 ms to 10 ms.
  values available if the range b (10 ms to 250 ms) programmability range is supported:
    0101b = 16 ms to 55 ms.
    0110b = 65 ms to 210 ms.
  values available if the range c (250 ms to 4 s) programmability range is supported:
    1001b = 260 ms to 900 ms.
    1010b = 1 s to 3.5 s.
  values available if the range d (4 s to 64 s) programmability range is supported:
    1101b = 4 s to 13 s.
    1110b = 17 s to 64 s.
  values not defined are reserved. software is permitted to change the value in this field at any time. for requests already pending when the completion timeout value is changed, hardware is permitted to use either the new or the old value for the outstanding requests and is permitted to base the start time for each request either on when this value was changed or on when each request was issued. the default value for this field is 0000b.
4  rw  0b  completion timeout disable. when set to 1b, this bit disables the completion timeout detection mechanism. software is permitted to set or clear this bit at any time. if there are outstanding requests when the bit is cleared, it is permitted but not required for hardware to apply the completion timeout mechanism to the outstanding requests. if this is done, it is permitted to base the start time for each request on either the time this bit was cleared or the time each request was issued. the default value for this bit is 0b.
5  ro  0b  alternative rid interpretation (ari) forwarding enable. applicable only to switch devices.
15:6  ro  0x0  reserved.
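The completion timeout encodings in device cap 2 (bits 3:0) and device control 2 (bits 4:0) can be cross-checked in software. The following is a minimal sketch, not driver code: the function names are hypothetical, and the register values are assumed to have been read from configuration space by some other means. The encodings themselves are copied from the two tables above.

```python
# Valid "completion timeout ranges supported" encodings (device cap 2, bits 3:0).
# All other values are reserved according to the table.
VALID_RANGE_ENCODINGS = {0b0000, 0b0001, 0b0010, 0b0011,
                         0b0110, 0b0111, 0b1110, 0b1111}

# Completion timeout value (device control 2, bits 3:0) ->
# (range letter, or None for the default range; human-readable period).
TIMEOUT_VALUES = {
    0b0000: (None, "50 us to 50 ms (default)"),
    0b0001: ("A", "50 us to 100 us"),
    0b0010: ("A", "1 ms to 10 ms"),
    0b0101: ("B", "16 ms to 55 ms"),
    0b0110: ("B", "65 ms to 210 ms"),
    0b1001: ("C", "260 ms to 900 ms"),
    0b1010: ("C", "1 s to 3.5 s"),
    0b1101: ("D", "4 s to 13 s"),
    0b1110: ("D", "17 s to 64 s"),
}


def supported_ranges(devcap2):
    """Return the range letters advertised in device cap 2, bits 3:0."""
    field = devcap2 & 0xF
    if field not in VALID_RANGE_ENCODINGS:
        raise ValueError(f"reserved range encoding {field:04b}b")
    # In every defined encoding, bit 0 = range A ... bit 3 = range D.
    return [letter for bit, letter in enumerate("ABCD") if field & (1 << bit)]


def decode_devctl2(devctl2, devcap2):
    """Decode bits 4:0 of device control 2 against the advertised ranges."""
    value = devctl2 & 0xF
    if value not in TIMEOUT_VALUES:
        raise ValueError(f"reserved completion timeout value {value:04b}b")
    letter, period = TIMEOUT_VALUES[value]
    if letter is not None and letter not in supported_ranges(devcap2):
        raise ValueError(f"range {letter} is not advertised in device cap 2")
    return {"timeout": period, "timeout_disabled": bool(devctl2 & 0x10)}


# 0x1f is the device cap 2 default listed above: ranges 1111b, bit 4 = 1b.
print(supported_ranges(0x1F))
print(decode_devctl2(0x16, 0x1F))
```

Keeping the value-to-range mapping as an explicit table, rather than deriving it from bit positions, makes reserved encodings fail loudly instead of being silently accepted.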
the configuration address for a pcie device is computed from pci-compatible bus, device, and function numbers as follows.

table 9-6. pcie extended configuration space
bits   31:28  27:20  19:15     14:12  11:2                       1:0
field  0000b  bus #  device #  fun #  register address (offset)  00b

pcie extended configuration space is allocated using a linked list of optional or required pcie extended capabilities following a format resembling pci capability structures. the first pcie extended capability is located at offset 0x100 in the function configuration space. the first dword of the capability structure identifies the capability/version and points to the next capability. the 82576 supports the following pcie extended capabilities.

table 9-7. pcie extended capability structure
capability  offset  next header
advanced error reporting capability  0x100  0x140/0x150/0x000 (note 1)
serial number (note 2)  0x140  0x150/0x000 (note 1)
alternative rid interpretation (ari)  0x150  0x160
iov support  0x160  0x000
note 1. depends on eeprom settings enabling the serial numbers and ari/iov structures.
note 2. not available in eeprom-less systems.

9.6.1 advanced error reporting (aer) capability

the pcie aer capability is an optional extended capability to support advanced error reporting. the following table lists the pcie aer extended capability structure for pcie functions.

table 9-8. pcie aer extended capability structure
register offset  field  description
0x100  pcie cap id  pcie extended capability id.
0x104  uncorrectable error status  reports error status of individual uncorrectable error sources on a pcie device.
0x108  uncorrectable error mask  controls reporting of individual uncorrectable errors by device to the host bridge via a pcie error message.
0x10c  uncorrectable error severity  controls whether an individual uncorrectable error is reported as a fatal error.
0x110  correctable error status  reports error status of individual correctable error sources on a pcie device.
0x114  correctable error mask  controls reporting of individual correctable errors by device to the host bridge via a pcie error message.
0x118  advanced error capabilities and control register  identifies the bit position of the first uncorrectable error reported in the uncorrectable error status register.
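The address layout of table 9-6 and the linked list of table 9-7 can be sketched as follows. This is an illustration under stated assumptions: `config_address`, `parse_ext_cap_header`, `walk_ext_caps` and the `FAKE_CONFIG` dictionary are hypothetical names, and the dictionary stands in for real configuration reads. The capability ids and next pointers are the values given in this datasheet for the case where the serial number and ari/iov structures are all enabled.

```python
def config_address(bus, device, function, offset):
    """Pack bus/device/function/offset per table 9-6 (bits 31:28 stay 0000b)."""
    if not (0 <= bus < 256 and 0 <= device < 32 and 0 <= function < 8):
        raise ValueError("bus/device/function out of range")
    if not 0 <= offset < 4096 or offset & 0x3:
        raise ValueError("offset must be dword aligned and below 4096")
    return (bus << 20) | (device << 15) | (function << 12) | offset


def parse_ext_cap_header(dword):
    """Split the first dword of an extended capability structure."""
    return {
        "id": dword & 0xFFFF,             # bits 15:0
        "version": (dword >> 16) & 0xF,   # bits 19:16
        "next": (dword >> 20) & 0xFFF,    # bits 31:20
    }


def walk_ext_caps(read_dword, start=0x100):
    """Follow next pointers from offset 0x100 until a pointer of 0x000."""
    found, seen, offset = [], set(), start
    while offset and offset not in seen:  # 'seen' guards against loops
        seen.add(offset)
        header = parse_ext_cap_header(read_dword(offset))
        found.append((offset, header["id"]))
        offset = header["next"]
    return found


def make_header(cap_id, version, next_ptr):
    """Build a header dword for the simulated configuration space below."""
    return (next_ptr << 20) | (version << 16) | cap_id


# Simulated header dwords: aer -> serial number -> ari -> iov.
FAKE_CONFIG = {
    0x100: make_header(0x0001, 1, 0x140),
    0x140: make_header(0x0003, 1, 0x150),
    0x150: make_header(0x000E, 1, 0x160),
    0x160: make_header(0x0010, 1, 0x000),
}

print(hex(config_address(bus=1, device=2, function=3, offset=0x100)))
for offset, cap_id in walk_ext_caps(FAKE_CONFIG.__getitem__):
    print(f"offset {offset:#05x}: capability id {cap_id:#06x}")
```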
table 9-8. pcie aer extended capability structure (continued)
register offset  field  description
0x11c:0x128  header log  captures the header for the transaction that generated an error.

9.6.1.1 pcie cap id register (0x100; ro)

bit location  r/w  default value  description
15:0  ro  0x0001  extended capability id. pcie extended capability id indicating aer capability.
19:16  ro  0x1  version number. pcie aer extended capability version number.
31:20  ro  0x150  next capability pointer. next pcie extended capability pointer. a value of 0x140 points to the serial id capability. in eeprom-less systems or when serial id is disabled in the eeprom, the next pointer is 0x150 and points to the ari capability structure. if ari/iov and serial id are disabled in the eeprom, this field is 0x0.

9.6.1.2 uncorrectable error status register (0x104; r/w1cs)

the uncorrectable error status register reports error status of individual uncorrectable error sources on a pcie function. an individual error status bit that is set to 1b indicates that a particular error occurred; software can clear an error status by writing a 1b to the respective bit.

bit location  r/w  default value  description
3:0  ro  0x0  reserved.
4  r/w1cs  0b  data link protocol error status.
5  ro  0b  surprise down error status (optional). not supported in the 82576.
11:6  ro  0x0  reserved.
12  r/w1cs  0b  poisoned tlp status.
13  r/w1cs  0b  flow control protocol error status.
14  r/w1cs  0b  completion timeout status.
15  r/w1cs  0b  completer abort status.
16  r/w1cs  0b  unexpected completion status.
17  r/w1cs  0b  receiver overflow status.
18  r/w1cs  0b  malformed tlp status.
19  ro  0b  ecrc error status. not supported in the 82576.
uncorrectable error status register (0x104), continued:

bit location  r/w  default value  description
20  r/w1cs  0b  unsupported request error status. when caused by a function that claims a tlp
21  ro  0b  acs violation status. not supported in the 82576.
31:22  ro  0x0  reserved.

9.6.1.3 uncorrectable error mask register (0x108; rws)

the uncorrectable error mask register controls reporting of individual uncorrectable errors by device to the host bridge via a pcie error message. a masked error (respective bit set in mask register) is not reported to the host bridge by an individual device. there is a mask bit per bit in the uncorrectable error status register.

bit location  r/w  default value  description
3:0  ro  0x0  reserved.
4  rws  0b  data link protocol error mask.
11:5  ro  0x0  reserved.
12  rws  0b  poisoned tlp mask.
13  rws  0b  flow control protocol error mask.
14  rws  0b  completion timeout mask.
15  rws  0b  completer abort mask.
16  rws  0b  unexpected completion mask.
17  rws  0b  receiver overflow mask.
18  rws  0b  malformed tlp mask.
19  ro  0b  reserved.
20  rws  0b  unsupported request error mask.
31:21  ro  0x0  reserved.

9.6.1.4 uncorrectable error severity register (0x10c; rws)

the uncorrectable error severity register controls whether an individual uncorrectable error is reported as a fatal error. an uncorrectable error is reported as fatal when the corresponding error bit in the severity register is set. if the bit is cleared, the corresponding error is considered non-fatal.

bit location  r/w  default value  description
3:0  ro  0001b  reserved.
4  rws  1b  data link protocol error severity.
11:5  ro  0x0  reserved.
12  rws  0b  poisoned tlp severity.
13  rws  1b  flow control protocol error severity.
uncorrectable error severity register (0x10c), continued:

bit location  r/w  default value  description
14  rws  0b  completion timeout severity.
15  rws  0b  completer abort severity.
16  rws  0b  unexpected completion severity.
17  rws  1b  receiver overflow severity.
18  rws  1b  malformed tlp severity.
19  ro  0b  reserved.
20  rws  0b  unsupported request error severity.
31:21  ro  0x0  reserved.

9.6.1.5 correctable error status register (0x110; r/w1cs)

the correctable error status register reports error status of individual correctable error sources on a pcie device. when an individual error status bit is set to 1b, it indicates that a particular error occurred; software can clear an error status by writing a 1b to the respective bit.

bit location  r/w  default value  description
0  r/w1cs  0b  receiver error status.
5:1  ro  0x0  reserved.
6  r/w1cs  0b  bad tlp status.
7  r/w1cs  0b  bad dllp status.
8  r/w1cs  0b  replay_num rollover status.
11:9  ro  000b  reserved.
12  r/w1cs  0b  replay timer timeout status.
13  r/w1cs  0b  advisory non-fatal error status.
31:14  ro  0x0  reserved.

9.6.1.6 correctable error mask register (0x114; rws)

the correctable error mask register controls reporting of individual correctable errors by device to the host bridge via a pcie error message. a masked error (respective bit set in mask register) is not reported to the host bridge by an individual device. there is a mask bit per bit in the correctable error status register.

bit location  r/w  default value  description
0  rws  0b  receiver error mask.
5:1  ro  0x0  reserved.
6  rws  0b  bad tlp mask.
7  rws  0b  bad dllp mask.
8  rws  0b  replay_num rollover mask.
correctable error mask register (0x114), continued:

bit location  r/w  default value  description
11:9  ro  000b  reserved.
12  rws  0b  replay timer timeout mask.
13  rws  0b  advisory non-fatal error mask.
31:14  ro  0x0  reserved.

9.6.1.7 advanced error capabilities and control register (0x118; ro)

bit location  r/w  default value  description
4:0  ro  0x0  vector pointing to the first recorded error in the uncorrectable error status register.
5  ro  0b  ecrc generation capable. this bit indicates that the 82576 is capable of generating ecrc. tied to 0b in the 82576.
6  ro  0b  ecrc generation enable. this bit, when set, enables ecrc generation. tied to 0b in the 82576.
7  ro  0b  ecrc check capable. this bit indicates that the 82576 is capable of checking ecrc. tied to 0b in the 82576.
8  ro  0b  ecrc check enable. this bit, when set, enables ecrc checking. tied to 0b in the 82576.
31:9  ro  0x0  reserved.

9.6.1.8 header log register (0x11c:0x128; ro)

the header log register captures the header for the transaction that generated an error. this register is 16 bytes in length.

bit location  r/w  default value  description
127:0  ro  0b  header of the packet in error (tlp or dllp).

9.6.2 serial number

the pcie device serial number capability is an optional extended capability implemented by the 82576. the device serial number is a read-only 64-bit value that is unique for a given pcie device. both functions return the same device serial number value.

9.6.2.1 device serial number enhanced capability header register (0x140; ro)

the following table lists the allocation of register fields in the device serial number enhanced capability header. it also lists the respective bit definitions. the extended capability id for the device serial number capability is 0x0003.
device serial number enhanced capability header register (0x140) bit fields:

bit(s) location  default value  r/w  description
15:0  0x0003  ro  pcie extended capability id. this field is a pci-sig defined id number that indicates the nature and format of the extended capability. extended capability id for the device serial number.
19:16  0x1  ro  capability version. this field is a pci-sig defined version number that indicates the version of the current capability structure.
31:20  0x150  ro  next capability offset. this field contains the offset to the next pcie capability structure or 0x000 if no other items exist in the linked list of capabilities. the value of this field is 0x150 to point to the ari capability structure. if ari/iov and serial id are disabled in the eeprom, then this field is 0x0.

9.6.2.2 serial number register (0x144:0x148; ro)

the serial number register is a 64-bit field that contains the ieee defined 64-bit extended unique identifier (eui-64). table 9-9 lists the allocation of register fields in the serial number register. table 9-10 lists the respective bit definitions.

table 9-9. serial number register
bits  contents
31:0  serial number register (lower dword).
63:32  serial number register (upper dword).

table 9-10. sn definition
bit(s) location  r/w  description
63:0  ro  pcie device serial number. this field contains the ieee defined 64-bit extended unique identifier (eui-64). this identifier includes a 24-bit company id value assigned by the ieee registration authority and a 40-bit extension identifier assigned by the manufacturer.

serial number definition in the 82576: the serial number uses the mac address according to the following definition:

table 9-11. sn and mac address
field  extension identifier, company id
order  addr+0  addr+1  addr+2  addr+3  addr+4  addr+5  addr+6  addr+7
       most significant byte ... least significant byte
table 9-11. sn and mac address (continued)
       most significant bit ... least significant bit

the serial number can be constructed from the 48-bit mac address in the following form:

table 9-12. sn constructed from 48-bit mac address
field  extension identifier, mac label, company id
order  addr+0  addr+1  addr+2  addr+3  addr+4  addr+5  addr+6  addr+7
       most significant bytes ... least significant byte
       most significant bit ... least significant bit

the mac label in this case is 0xffff. for example, assume that the company id is (intel) 00-a0-c9 and the extension identifier is 23-45-67. in this case, the 64-bit serial number is:

table 9-13. example 64-bit sn
field  extension identifier (addr+0 to addr+2), mac label (addr+3, addr+4), company id (addr+5 to addr+7)
order  addr+0  addr+1  addr+2  addr+3  addr+4  addr+5  addr+6  addr+7
value  67      45      23      ff      ff      c9      a0      00
       most significant byte ... least significant byte
       most significant bit ... least significant bit

the mac address is the function 0 mac address as loaded from the eeprom into the ral and rah registers. the translation from eeprom words 0 to 2 to the serial number is as follows:
- serial number addr+0 = eeprom byte 5
- serial number addr+1 = eeprom byte 4
- serial number addr+2 = eeprom byte 3
- serial number addr+3,4 = 0xff 0xff
- serial number addr+5 = eeprom byte 2
- serial number addr+6 = eeprom byte 1
- serial number addr+7 = eeprom byte 0

the official document defining eui-64 is: http://standards.ieee.org/regauth/oui/tutorials/eui64.html

9.6.3 ari capability structure

in order to enable more than eight functions per end point without requiring an internal switch (typically needed in virtualization scenarios), the pci sig defines a new capability that enables a different interpretation of the bus, device, and function fields. the alternative routing id interpretation (ari) capability structure is described in sections 9.6.3.1 and 9.6.3.2.
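Returning to the serial number definition in section 9.6.2.2, the eeprom-byte translation can be sketched as a small function. This is an illustrative sketch only: `serial_number_bytes` is a hypothetical name, and it assumes eeprom bytes 0 to 5 hold the mac address with the first company id octet in byte 0, which is what the worked example (00-a0-c9 and 23-45-67) implies.

```python
def serial_number_bytes(mac):
    """Return serial number bytes addr+0 .. addr+7 for a 6-byte mac address.

    Assumption: mac[0] .. mac[5] correspond to eeprom bytes 0 .. 5.
    """
    if len(mac) != 6:
        raise ValueError("a mac address is exactly 6 bytes")
    return bytes([
        mac[5], mac[4], mac[3],   # addr+0..2: extension identifier
        0xFF, 0xFF,               # addr+3..4: mac label 0xffff
        mac[2], mac[1], mac[0],   # addr+5..7: company id
    ])


# Worked example from the text: company id 00-a0-c9, extension 23-45-67.
example = serial_number_bytes(bytes.fromhex("00a0c9234567"))
print(example.hex(" "))  # 67 45 23 ff ff c9 a0 00
```

The printed byte order matches the value row of the example 64-bit sn table.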
9.6.3.1 pcie ari header register (0x150; ro)

bit(s)  initial value  r/w  description
15:0  0x000e  ro  id - pcie extended capability id. pcie extended capability id for the ari.
19:16  0x1  ro  version - capability version. this field is a pci-sig defined version number that indicates the version of the current capability structure. must be 0x1 for this version of the specification.
31:20  0x160  ro  next capability ptr. - next capability offset. this field contains the offset to the next pcie extended capability structure. the value 0x160 points to the iov structure.

9.6.3.2 pcie ari capabilities & control register (0x154; ro)

bit(s)  r/w  initial value  description
0  ro  0b  m - mfvc function groups capability. applicable only to function 0; must be 0b for all other functions. if 1b, indicates that the ari device supports function group level arbitration via its multi-function virtual channel (mfvc) capability structure. not supported in the 82576.
1  ro  0b  a - acs function groups capability (a). applicable only to function 0; must be 0b for all other functions. if 1b, indicates that the ari device supports function group level granularity for acs p2p egress control via its acs capability structures. not supported in the 82576.
7:2  ro  0x0  reserved.
15:8  ro  0x1 (func 0), 0x0 (func 1) (note 1)  nfp - next function pointer. this field contains the pointer to the next physical function configuration space or 0x0000 if no other items exist in the linked list of functions. function 0 is the start of the linked list of functions.
    note 1. if port 0 and port 1 are switched or function zero is a dummy function, this register should keep its attributes according to the function number. if lan1 is disabled, then the value of this field in function zero should be zero.
16  ro  0b  m_en - mfvc function groups enable (m).
    applicable only for function 0; must be hardwired to 0b for all other functions. when set, the ari device must interpret entries in its function arbitration table as function group numbers rather than function numbers. not supported in the 82576.
17  ro  0b  a_en - acs function groups enable (a). applicable only for function 0; must be hardwired to 0b for all other functions. when set, each function in the ari device must associate bits within its egress control vector with function group numbers rather than function numbers. not supported in the 82576.
19:18  ro  00b  reserved.
22:20  ro  0x0  function group number (fgn). not supported in the 82576.
31:23  ro  0x0  reserved.
9.6.4 iov capability structure

this structure is used to report and control the iov capabilities.
9.6.4.1 pcie sr-iov header register (0x160; ro)

bit(s)  r/w  initial value  description
15:0  ro  0x0010  pcie extended capability id. pcie extended capability id of the sr-iov capability.
19:16  ro  0x1  capability version. this field is a pci-sig defined version number that indicates the version of the current capability structure. must be 0x1 for this version of the specification.
31:20  ro  0x0  next capability offset. this field contains the offset to the next pcie extended capability structure or 0x000 if no other items exist in the linked list of capabilities.

9.6.4.2 pcie sr-iov capabilities register (0x164; ro)

bit(s)  r/w  initial value  description
0  ro  0b  vf migration capable. migration capable device running under migration capable mr-pcim. ro as 0b in the 82576.
20:1  ro  0x0  reserved.
31:21  ro  0x0  vf migration interrupt message number. indicates the msi/msi-x vector used for the interrupts. this field is undefined when the vf migration capable bit is cleared.

9.6.4.3 pcie sr-iov control register (0x168; rw)

bit(s)  r/w  initial value  description
0  rw  0b  vfe: vf enable/disable. vf enable manages the assignment of vfs to the associated pf. if vf enable is set, the vfs associated with the pf are accessible in the pcie fabric. when set, vfs respond to and may issue pcie transactions following the rules for pcie endpoint functions. if clear, vfs are disabled and not visible in the pcie fabric; vfs shall respond to requests with ur and may not issue pcie transactions. setting vf enable after it has previously been cleared shall result in the same vf state as if flr had been issued to the vf.
1  ro  0b  vf me - vf migration enable. enables/disables vf migration support.
2  ro  0b  vf mie - vf migration interrupt enable. enables/disables vf migration state change interrupt.
pcie sr-iov control register (0x168), continued:

bit(s)  r/w  initial value  description
3  rw  0b  vf mse - memory space enable for virtual functions. vf mse controls memory space enable for all vfs associated with this pf, as with the memory space enable bit in a function's pci command register. the default value for this bit is 0b. when vf enable = 1b, virtual function memory space access is permitted only when vf mse is set. vfs must follow the same error reporting rules as defined in the base specification if an attempt is made to access a virtual function's memory space when vf enable is 1b and vf mse is 0b.
    note: a virtual function's memory space cannot be accessed when the vf enable bit = 0b. thus, vf mse is a don't care when vf enable = 0b; however, software might choose to set vf mse after programming the vf barn registers, prior to setting vf enable to 1b.
4  rw (func 0), ros (func 1) (note 1)  0b  ari capable hierarchy. the device is permitted to locate vfs in function numbers 8 to 255 of the captured bus number. default value is 0b. this field is rw in the lowest numbered pf. other functions use the pf0 value as sticky.
15:5  ro  0x0  reserved.
16  ro  0b  vfmis - vf migration event pending. indicates a vf migration in or migration out request has been issued by mr-pcim. to determine the cause of the event, software can scan the vf state array. not implemented in the 82576.
31:17  ro  0x0  reserved.
note 1. if the ports are switched, this field should keep its attributes according to the function number.

9.6.4.4 pcie sr-iov max/total vfs register (0x16c)

bit(s)  r/w  initial value  description
15:0  ro  0x8  initialvfs. indicates the number of vfs that are initially associated with the pf. if vf migration capable is clear, this field must contain the same value as totalvfs. a lower value of this field can be loaded from the iov control word in the eeprom.
31:16  ro  0x8  totalvfs.
    indicates the maximum number of vfs that could be associated with the pf. in the 82576, this is equal to initialvfs.
9.6.4.5 pcie sr-iov num vfs register (0x170; r/w)

bit(s)  r/w  initial value  description
15:0  r/w  0x0  numvfs. defines the number of vfs software has assigned to the pf. software sets numvfs as part of the process of creating vfs. numvfs vfs must be visible in the pcie fabric after numvfs is set to a valid value and vf enable is set to 1b. visible in the pcie fabric means that the vf must respond to pcie transactions targeting the vf, following all other rules defined by this specification and the base specification. the results are undefined if numvfs is set to a value greater than totalvfs. numvfs can only be written while vf enable is clear. the numvfs field is ro when vf enable is set.
23:16  ro  0x0 (func 0), 0x1 (func 1) (note 1)  fdl - function dependency link. defines dependencies between physical functions allocation. the default behavior of the 82576 is not to define any such constraints.
31:24  ro  0x0  reserved.
note 1. even if port 0 and port 1 are switched or function zero is a dummy function, this register should keep its attributes according to the function number.

9.6.4.6 pcie sr-iov vf rid mapping register (0x174; ro)

see section 7.10.2.6 for details of the rid mapping.

bit(s)  r/w  initial value  description
15:0  ro  0x180  fvo. first vf offset defines the requestor id (rid) offset of the first vf that is associated with the pf that contains this capability structure. the first vf's 16-bit rid is calculated by adding the contents of this field to the rid of the pf containing this field. the content of this field is valid only when vf enable is set. if vf enable is 0b, the contents are undefined. if the ari enable bit is set, this field changes to 0x80.
31:16  ro  0x2  vfs. vf stride defines the requestor id (rid) offset from one vf to the next one for all vfs associated with the pf that contains this capability structure.
    the next vf's 16-bit rid is calculated by adding the contents of this field to the rid of the current vf. the content of this field is valid only when vf enable is set and numvfs is non-zero. if vf enable is 0b or if numvfs is zero, the contents are undefined.
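The rid arithmetic described for first vf offset and vf stride can be sketched as follows. This is a minimal illustration rather than the device's actual mapping logic (see section 7.10.2.6 for that): `vf_rids` is a hypothetical helper, the pf rid 0x0100 (bus 1, device 0, function 0) is an arbitrary example value, and the defaults are the initial values listed above (first vf offset 0x180, or 0x80 with ari enabled; vf stride 0x2; totalvfs 8).

```python
TOTAL_VFS = 8  # totalvfs initial value from section 9.6.4.4


def vf_rids(pf_rid, num_vfs, first_vf_offset=0x180, vf_stride=0x2):
    """Return the 16-bit rids of vf 0 .. num_vfs-1 for the given pf rid."""
    if not 0 <= num_vfs <= TOTAL_VFS:
        # Results are undefined if numvfs exceeds totalvfs, so refuse here.
        raise ValueError("numvfs must not exceed totalvfs")
    # First vf: pf rid + first vf offset; each next vf adds the stride.
    return [(pf_rid + first_vf_offset + i * vf_stride) & 0xFFFF
            for i in range(num_vfs)]


pf_rid = 0x0100  # hypothetical pf at bus 1, device 0, function 0
print([hex(rid) for rid in vf_rids(pf_rid, 3)])
print([hex(rid) for rid in vf_rids(pf_rid, 3, first_vf_offset=0x80)])
```

With the non-ari offset of 0x180 the first vf of this example pf lands at rid 0x0280, which is on the next bus number; with the ari offset of 0x80 it stays on the pf's bus as function number 0x80.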
9.6.4.7 pcie sr-iov vf device id register (0x178; ro)

bit(s)  r/w  initial value  description
31:16  ro  0x10ca  this field contains the device id that should be presented for every vf to the virtual machine software. the value of this field may be read from eeprom word 0x26.
15:0  ro  0  reserved.

9.6.4.8 pcie sr-iov supported page size register (0x17c; ro)

bit(s)  r/w  initial value  description
31:0  ro  0x553  supported page size. for pfs that support the stride-based bar mechanism, this field defines the supported page sizes. this pf supports a page size of 2^(n+12) if bit n is set. for example, if bit 0 is set, the ep supports 4 kb page sizes. endpoints are required to support 4 kb, 8 kb, 64 kb, 256 kb, 1 mb and 4 mb page sizes. all other page sizes are optional.
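The 2^(n+12) rule can be checked against the initial value 0x553 with a few lines of code. This is a sketch with a hypothetical function name; only the rule and the register value come from the table above.

```python
def supported_page_sizes(mask):
    """Return the page sizes in bytes encoded by a supported page size mask."""
    # Bit n set means a page size of 2^(n+12) bytes is supported.
    return [1 << (n + 12) for n in range(32) if mask & (1 << n)]


sizes = supported_page_sizes(0x553)
print([f"{size // 1024} kb" for size in sizes])
```

For 0x553 the set bits are 0, 1, 4, 6, 8 and 10, which correspond exactly to the six required page sizes (4 kb, 8 kb, 64 kb, 256 kb, 1 mb and 4 mb).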
9.6.4.9 pcie sr-iov system page size register (0x180; r/w)

bit(s)  r/w  initial value  description
31:0  r/w  0x1  page size. this field defines the page size the system uses to map the pf's and associated vfs' memory addresses. software must set the value of the system page size to one of the page sizes set in the supported page sizes field. as with supported page sizes, if bit n is set in system page size, the pf and its associated vfs are required to support a page size of 2^(n+12). for example, if bit 1 is set, the system is using an 8 kb page size. the results are undefined if more than one bit is set in system page size. the results are undefined if a bit is set in system page size that is not set in supported page sizes. when system page size is set, the pf and associated vfs are required to align all bar resources on a system page size boundary. each bar size, including vf barn size (described in the sections that follow), must be aligned on a system page size boundary and must be sized to consume a multiple of system page size bytes. all fields requiring page size alignment within a function must be aligned on a system page size boundary. vf enable must be set to 0b when system page size is set. the results are undefined if system page size is set when vf enable is set.

9.6.4.10 pcie sr-iov bar 0 - low register (0x184; r/w)

9.6.4.11 pcie sr-iov bar 0 - high register (0x188; r/w)

bar 0 - low register (0x184) bit fields:

bit(s)  r/w  initial value  description
0  ro  0b  mem. 0b = indicates memory space.
2:1  ro  10b  mem type. indicates the address space size. 10b = 64-bit. bar bit sizes are set according to bit 2 in eeprom word 0x25.
3  ro  0b  prefetch mem. 0b = non-prefetchable space. 1b = prefetchable space. this bar's prefetchable bit is set according to bit 1 in eeprom word 0x25.
31:4  r/w  0x0  memory address space.
which bits are r/w bits and which are read only to 0b depends on the memory mapping window size. the size is a maximum between 16 kb and the page size. bit(s) r/w initial value description 31:0 rw 0b bar0 - msb. msb part of bar 0.
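the constraints on system page size (exactly one bit set, and that bit also set in supported page sizes) and the vf bar window rule (the larger of 16 kb and the page size) can be sketched as follows. all function names are hypothetical, and the default for `supported` is simply the initial value 0x553 from section 9.6.4.8.

```python
SUPPORTED_PAGE_SIZES = 0x553  # initial value of the supported page size register


def is_valid_system_page_size(value: int, supported: int = SUPPORTED_PAGE_SIZES) -> bool:
    """True if exactly one bit is set and that bit is advertised as supported."""
    one_bit = value != 0 and (value & (value - 1)) == 0
    return one_bit and bool(value & supported)


def page_size_bytes(value: int) -> int:
    """Page size in bytes for a one-hot system page size value (bit n -> 2^(n+12))."""
    return 1 << (value.bit_length() - 1 + 12)


def vf_bar_window(page_size: int) -> int:
    """Memory mapping window size: the larger of 16 KB and the page size."""
    return max(16 * 1024, page_size)


def is_page_aligned(address: int, page_size: int) -> bool:
    """True if a BAR base address sits on a system page size boundary."""
    return address % page_size == 0
```

for example, the default value 0x1 selects 4 kb pages, for which the window stays at 16 kb; selecting 64 kb pages (value 0x10) raises the window to 64 kb.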
9.6.4.12 pcie sr-iov bar 2 register (0x18c; ro)

bit(s) | r/w | initial value | description
31:0 | ro | 0b | bar2. this bar is not used.

9.6.4.13 pcie sr-iov bar 3 - low register (0x190; r/w)

bit(s) | r/w | initial value | description
0 | ro | 0b | mem. 0b = indicates memory space.
2:1 | ro | 10b | mem type. indicates the address space size. 10b = 64-bit. bar bit sizes are set according to bit 2 in eeprom word 0x25.
3 | ro | 0b | prefetch mem. 0b = non-prefetchable space. 1b = prefetchable space. this bar's prefetchable bit is set according to bit 1 in eeprom word 0x25.
31:4 | r/w | 0b | memory address space. which bits are r/w bits and which are read only to 0b depends on the memory mapping window size. the window size is the larger of 16 kb and the page size.

9.6.4.14 pcie sr-iov bar 3 - high register (0x194; r/w)

bit(s) | r/w | initial value | description
31:0 | rw | 0x0 | bar3 - msb. msb part of bar 3.

9.6.4.15 pcie sr-iov bar 5 register (0x198; ro)

bit(s) | r/w | initial value | description
31:0 | ro | 0x0 | bar5. this bar is not used.

9.6.4.16 pcie sr-iov vf migration state array offset register (0x19c; ro)

bit(s) | r/w | initial value | description
2:0 | ro | 000b | bir. indicates which pf bar contains the vf migration state array. not implemented in the 82576.
31:3 | ro | 0x0 | offset, relative to the beginning of the bar, of the start of the migration array. not implemented in the 82576.

9.7 virtual functions (vf) configuration space

the configuration space reflected to each vf is a sparse version of the physical function configuration space. table 9-14 lists the behavior of each register in the vf configuration space.

table 9-14. vf pcie configuration space

section: pci mandatory registers
offset | name | vf behavior | notes
0 | vendor id | ro - 0xffff |
2 | device id | ro - 0xffff |
4 | command | rw | see section 9.7.1.1 for details
6 | status | per vf | see section 9.7.1.2 for details
8 | revision id | ro as pf |
9 | class code | ro as pf |
c | cache line size | ro - 0 |
d | latency timer | ro - 0 |
e | header type | ro - 0 |
f | bist | ro - 0 |
10 - 27 | bars | ro - 0 | emulated by vmm
28 | cardbus cis | ro - 0 | not used
2c | sub vendor id | ro as pf |
2e | sub system | ro as pf |
30 | expansion rom | ro - 0 | emulated by vmm
34 | cap pointer | ro - 0x70 | points to msi-x
3c | int line | ro - 0 |
3d | int pin | ro - 0 |
3e | max lat/min gnt | ro - 0 |
table 9-14. vf pcie configuration space (continued)

section: msi-x capability
offset | name | vf behavior | notes
70 | msi-x header | ro - 0xa011 | points to pcie capability
72 | msi-x message control | per vf | see section 9.7.2.1.1
74 | msi-x table address | ro |
78 | msi-x pba address | ro |

section: pcie capability
offset | name | vf behavior | notes
a0 | pcie header | ro - 0x0010 | last capability
a2 | pcie capabilities | ro - as pf 0x0002 |
a4 | pcie dev cap | ro - as pf |
a8 | pcie dev ctrl | rw | ro as zero apart from flr - see section 9.7.2.2.1
aa | pcie dev status | per vf | see section 9.7.2.2.2
ac | pcie link cap | ro - as pf |
b0 | pcie link ctrl | ro - 0x0 |
b2 | pcie link status | ro - 0x0 |
c4 | pcie dev cap 2 | ro - as pf |
c8 | pcie dev ctrl 2 | ro - 0x0 | the timeout value and timeout disable of the pf are used for all vfs.
d0 | pcie link ctrl 2 | ro - 0x0 |
d2 | pcie link status 2 | ro - 0x0 |

section: aer capability
offset | name | vf behavior | notes
100 | aer - header | ro - 0x15010001 | points to ari structure
104 | aer - uncorr status | per vf | see section 9.7.2.3.1
108 | aer - uncorr mask | ro - 0x0 |
10c | aer - uncorr severity | ro - 0x0 |
110 | aer - corr status | per vf | see section 9.7.2.3.2
114 | aer - corr mask | ro - 0x0 |
118 | aer - cap/ctrl | per vf | same structure as in pf
11c:128 | aer - error log | one log per vf | same structure as in pf.

section: ari capability
offset | name | vf behavior | notes
150 | ari - header | 0x0001000e | last
154 | ari - cap/ctrl | ro - 0 |
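the header values in table 9-14 encode the capability chain that software walks in a vf. the sketch below decodes them using the standard pci capability header layouts (8-bit id plus 8-bit next pointer for legacy capabilities; 16-bit id, 4-bit version and 12-bit next offset for extended capabilities). the function names are hypothetical.

```python
def decode_cap_header(header: int) -> tuple[int, int]:
    """Split a 16-bit legacy capability header into (capability id, next pointer)."""
    return header & 0xFF, (header >> 8) & 0xFF


def decode_ext_cap_header(header: int) -> tuple[int, int, int]:
    """Split a 32-bit extended capability header into (id, version, next offset)."""
    return header & 0xFFFF, (header >> 16) & 0xF, (header >> 20) & 0xFFF


msix = decode_cap_header(0xA011)          # header at offset 0x70
pcie = decode_cap_header(0x0010)          # header at offset 0xa0
aer = decode_ext_cap_header(0x15010001)   # header at offset 0x100
ari = decode_ext_cap_header(0x0001000E)   # header at offset 0x150
```

decoded this way, the chain reads: cap pointer 0x70 (msi-x, id 0x11) to 0xa0 (pcie, id 0x10, end of list), and in extended space 0x100 (aer, id 0x0001) to 0x150 (ari, id 0x000e, end of list), which agrees with the notes column of the table.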
9.7.1 legacy header details

9.7.1.1 vf command register (0x4; rw)

bit(s) | r/w | initial value | description
0 | ro | 0b | ioae - i/o access enable. ro as a zero field.
1 | ro | 0b | mae - memory access enable. ro as a zero field.
2 | rw | 0b | bme - bus master enable. disabling this bit prevents the associated vf from issuing any memory or i/o requests. note that as msi/msi-x interrupt messages are in-band memory writes, disabling the bus master enable bit disables msi/msi-x interrupt messages as well. requests other than memory or i/o requests are not controlled by this bit. note: the state of active transactions is not specified when this bit is disabled after being enabled. the 82576 can choose how it behaves when this condition occurs. software cannot count on the 82576 retaining state and resuming without loss of data when the bit is re-enabled. transactions for a vf that has its bus master enable bit set must not be blocked by transactions for vfs that have their bus master enable bit cleared.
3 | ro | 0b | scm - special cycle enable. hardwired to 0b.
4 | ro | 0b | mwie - mwi enable. hardwired to 0b.
5 | ro | 0b | pse - palette snoop enable. hardwired to 0b.
6 | ro | 0b | per - parity error response. zero for vfs.
7 | ro | 0b | wce - wait cycle enable. hardwired to 0b.
8 | ro | 0b | serre - serr# enable. zero for vfs.
9 | ro | 0b | fb2be - fast back-to-back enable. hardwired to 0b.
10 | ro | 0b | intd - interrupt disable. hardwired to 0b.
15:11 | ro | 0x0 | reserved.
9.7.1.2 vf status register (0x6; rw)

bits | r/w | initial value | description
2:0 | ro | 000b | reserved.
3 | ro | 0b | interrupt status. hardwired to 0b.
4 | ro | 1b | new capabilities. indicates that the 82576 vfs implement extended capabilities. the 82576 vfs implement a capabilities list to indicate that they support enhanced message signaled interrupts and pcie extensions.
5 | ro | 0b | 66mhz capable. hardwired to 0b.
6 | ro | 0b | reserved.
7 | ro | 0b | fast back-to-back capable. hardwired to 0b.
8 | r/w1c | 0b | mperr - data parity reported.
10:9 | ro | 00b | devsel timing. hardwired to 0b.
11 | r/w1c | 0b | sta - signaled target abort.
12 | r/w1c | 0b | rta - received target abort.
13 | r/w1c | 0b | rma - received master abort.
14 | r/w1c | 0b | sserr - signaled system error.
15 | r/w1c | 0b | dserr - detected parity error.

9.7.2 vf legacy capabilities

9.7.2.1 vf msi-x capability

the only msi-x register with a different layout than in the pf is the control register.

9.7.2.1.1 vf msi-x control register (0x72; rw)

bits | r/w | initial value | description
10:0 | ro | 0x002 (1) | ts - table size.
13:11 | ro | 000b | reserved.
14 | rw | 0b | mask - function mask.
15 | rw | 0b | en - msi-x enable.

(1) default value is read from the i/o virtualization (iov) control eeprom word.
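the fields of the vf msi-x control register can be unpacked as in the sketch below. note one assumption that comes from the pci msi-x capability definition rather than from this datasheet: the table size field is encoded as n-1, so the default 0x002 corresponds to three table entries. the function name is hypothetical.

```python
def decode_msix_control(value: int) -> dict:
    """Unpack a 16-bit MSI-X message control value into its fields."""
    return {
        # Bits 10:0; the PCI MSI-X definition encodes the entry count as N-1.
        "table_entries": (value & 0x7FF) + 1,
        # Bit 14: function mask.
        "function_mask": bool(value & (1 << 14)),
        # Bit 15: MSI-X enable.
        "enable": bool(value & (1 << 15)),
    }


default = decode_msix_control(0x0002)
enabled = decode_msix_control(0x8002)
```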
9.7.2.2 vf pcie capability registers

the device control and device status registers have some fields that are specific per vf.

9.7.2.2.1 vf device control register (0xa8; rw)

bits | r/w | default | description
0 | ro | 0b | correctable error reporting enable. zero for vfs.
1 | ro | 0b | non-fatal error reporting enable. zero for vfs.
2 | ro | 0b | fatal error reporting enable. zero for vfs.
3 | ro | 0b | unsupported request reporting enable. zero for vfs.
4 | ro | 0b | enable relaxed ordering. zero for vfs.
7:5 | ro | 0b | max payload size. zero for vfs.
8 | ro | 0b | extended tag field enable. not implemented in the 82576.
9 | ro | 0b | phantom functions enable. not implemented in the 82576.
10 | ro | 0b | auxiliary power pm enable. zero for vfs.
11 | ro | 0b | enable no snoop. zero for vfs.
14:12 | ro | 000b | max read request size. zero for vfs.
15 | rw | 0b | initiate function level reset. specific to each vf.

9.7.2.2.2 vf device status register (0xaa; rw1c)

bits | r/w | default | description
0 | rw1c | 0b | correctable detected. indicates status of correctable error detection.
1 | rw1c | 0b | non-fatal error detected. indicates status of non-fatal error detection.
2 | rw1c | 0b | fatal error detected. indicates status of fatal error detection.
3 | rw1c | 0b | unsupported request detected. indicates that the 82576 received an unsupported request. this field is identical in all functions. the 82576 cannot distinguish which function caused an error.
4 | ro | 0b | aux power detected. zero for vfs.
5 | ro | 0b | transaction pending. specific per vf. when set, indicates that a particular function (pf or vf) has issued non-posted requests that have not been completed. a function reports this bit cleared only when all completions for any outstanding non-posted requests have been received.
15:6 | ro | 0x00 | reserved.

9.7.2.3 vf advanced error reporting registers

the following registers in the aer capability have a different behavior in a vf function.

9.7.2.3.1 vf uncorrectable error status register (0x104; r/w1cs)

the uncorrectable error status register reports error status of individual uncorrectable error sources on a pcie device. when an individual error status bit is set to 1b, it indicates that a particular error occurred; software can clear an error status by writing a 1b to the respective bit. see the table below.

bit location | r/w | default value | description
3:0 | ro | 0000b | reserved.
4 | ro | 0b | data link protocol error status.
5 | ro | 0b | surprise down error status (optional).
11:6 | ro | 0x0 | reserved.
12 | r/w1cs | 0b | poisoned tlp status.
13 | ro | 0b | flow control protocol error status.
14 | r/w1cs | 0b | completion timeout status.
15 | r/w1cs | 0b | completer abort status.
16 | r/w1cs | 0b | unexpected completion status.
17 | ro | 0b | receiver overflow status.
18 | ro | 0b | malformed tlp status.
19 | ro | 0b | ecrc error status.
20 | r/w1cs | 0b | unsupported request error status. when caused by a function that claims a tlp.
21 | ro | 0b | acs violation status.
31:22 | ro | 0x0 | reserved.
9.7.2.3.2 vf correctable error status register (0x110; r/w1cs)

bit location | r/w | default value | description
0 | ro | 0b | receiver error status.
5:1 | ro | 0x0 | reserved.
6 | ro | 0b | bad tlp status.
7 | ro | 0b | bad dllp status.
8 | ro | 0b | replay_num rollover status.
11:9 | ro | 000b | reserved.
12 | ro | 0b | replay timer timeout status.
13 | r/w1cs | 0b | advisory non-fatal error status.
31:14 | ro | 0x0 | reserved.
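the write-1-to-clear behavior of the vf error status registers can be modeled in a few lines. this is a software model for illustration only, not driver code; the masks are derived from the r/w1cs rows of the two tables in sections 9.7.2.3.1 and 9.7.2.3.2, and the names are hypothetical.

```python
# Bits marked r/w1cs in the VF uncorrectable error status register (0x104).
VF_UNCORR_RW1C = (1 << 12) | (1 << 14) | (1 << 15) | (1 << 16) | (1 << 20)
# Bits marked r/w1cs in the VF correctable error status register (0x110).
VF_CORR_RW1C = 1 << 13


def rw1c_write(current: int, written: int, rw1c_mask: int) -> int:
    """Return the register value after a write.

    Writing 1b to an r/w1c bit clears it; writing 0b leaves it unchanged.
    Bits outside rw1c_mask are treated as read-only and are not affected.
    """
    return current & ~(written & rw1c_mask)


# Poisoned TLP (bit 12) and completion timeout (bit 14) are pending;
# software acknowledges only the poisoned TLP status.
after = rw1c_write((1 << 12) | (1 << 14), 1 << 12, VF_UNCORR_RW1C)
```

a practical consequence of these semantics is that software should write back only the bits it has handled, since writing the value it just read clears every pending status bit at once.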
10.0 system manageability

network management is an important requirement in today's networked computer environment. software-based management applications provide the ability to administer systems while the operating system is functioning in a normal power state (not in a pre-boot state or powered-down state). the intel® system management bus (smbus) interface and the network controller sideband interface (nc-si; type c) fill the management void that exists when the operating system is not running or fully functional. this is accomplished by providing mechanisms by which manageability network traffic can be routed to and from a management controller (mc).

this chapter describes the supported management interfaces and hardware configurations for platform system management. it describes the interfaces to an external mc, the partitioning of platform manageability among system components, and the functionality provided by the 82576 in each platform configuration.

10.1 pass-through (pt) functionality

pass-through (pt) is the term used when referring to the process of sending and receiving ethernet traffic over the sideband interface. the 82576 has the ability to route ethernet traffic to the host operating system as well as the ability to send ethernet traffic over the sideband interface to an external mc. see figure 10-1.

figure 10-1. sideband interface

the sideband interface provides a mechanism by which the 82576 can be shared between the host and the mc. by providing this sideband interface, the mc can communicate with the lan without requiring a dedicated ethernet controller. the 82576 supports two sideband interfaces:
• smbus
• nc-si
the usable bandwidth for either direction is up to 400 kb/s when using smbus and 100 mb/s for the nc-si interface. only one mode of sideband can be active at any given time. the configuration is done using an nvm setting.

10.2 sideband packet routing

when an ethernet packet reaches the 82576, it is examined and compared to a number of configurable filters. these filters are configurable by the mc and include, but are not limited to, filtering on:
• mac address
• ip address
• udp/ip ports
• vlan tags
• ethertype

if the incoming packet matches any of the configured filters, it is passed to the mc. otherwise it is not passed.

10.3 components of the sideband interface

there are two components to a sideband interface:
• physical layer
• logical layer

the mc and the 82576 must be in alignment for both components. an example issue: the nc-si physical interface is based on the rmii interface, but there are differences between the devices at the physical level and the protocol layer is completely different.

10.3.1 physical layer

this is the electrical connection between the 82576 and the mc.

10.3.1.1 smbus

the smbus physical layer is defined by the smbus specification. the interface is made up of two connections: data and clock. there is also an optional third connection: the alert line. this line is used by the 82576 to notify the mc that there is data available for reading. refer to the smbus specification.

10.3.1.2 nc-si

the 82576 uses the dmtf standard sideband interface. this interface consists of 6 lines for transmission and reception of ethernet packets and two optional lines for arbitration among more than one physical network controller. the physical layer of nc-si is very similar to the rmii interface, although not an exact duplicate. refer to the nc-si specification.
10.3.2 logical layer

10.3.2.1 smbus

the protocol layer for smbus consists of commands the mc issues to configure filtering for 82576 management traffic and the reading and writing of ethernet frames over the smbus interface. there is no industry standard protocol for sideband traffic over smbus. the protocol layer for smbus on the 82576 is intel proprietary.

10.3.2.2 nc-si

the dmtf also defines the protocol layer for the nc-si interface. nc-si compliant devices are required to implement a minimum set of commands. the specification also provides a mechanism for vendors to add additional capabilities through the use of oem commands. intel oem nc-si commands for the 82576 are discussed in this document. for information on base nc-si commands, see the nc-si specification.

10.4 packet filtering

since both the host operating system and the mc use the 82576 to send and receive ethernet traffic, there needs to be a mechanism by which incoming ethernet packets can be identified as those that should be sent to the mc rather than the host operating system.

there are two different types of filtering available. the first is filtering based upon the mac address. with this filtering, the mc has at least one dedicated mac address and incoming ethernet traffic with the matching mac address(es) is passed to the mc. this is the simplest filtering mechanism to utilize and it allows an mc to receive all types of traffic (including, but not limited to, ipmi, nfs, http, etc.).

the other mechanism is highly configurable and allows packets to be filtered using a wide range of parameters. using this method, an mc can share a mac address (and ip address, if desired) with the host os and receive only specific ethernet traffic. this method is useful if the mc is only interested in specific traffic, such as ipmi packets.
10.4.1 manageability receive filtering

this section describes the manageability receive packet filtering flow. the description applies to the 82576 lan ports. packet reception by the 82576 can generate one of the following results:
• discarded
• sent to host memory
• sent to the external mc
• sent to both the mc and host memory

in default mode, a packet directed to the mc is not also directed to the host. the mc can configure the 82576 to direct certain manageability packets to host memory by setting the en_mng2host bit in the manc register. the mc then needs to configure the 82576 to send manageability packets to the host (according to their type) by setting the corresponding bits in the manageability to host (manc2h) register.
arp requests are an example of packets that might need to be sent to both the mc and the host operating system. if the mc configures the manageability filters to send arp requests to the mc and does not also configure the settings to send them to the host, the host operating system never receives arp requests.

there are two modes of receive manageability filtering:
1. receive all - all received packets are routed to the mc.
2. receive filtering - only certain types of packets are routed to the mc.

the mc controls the types of packets that it receives by programming receive manageability filters. the following filters are accessible to the mc:

filters | functionality | when reset?
filters enable | general configuration of the manageability filters | internal power on reset
manageability to host | enables routing of manageability packets to host | internal power on reset
manageability decision filters [7:0] | configuration of manageability decision filters | internal power on reset
mac address [3:0] | four unicast mac manageability addresses | internal power on reset
vlan filters [7:0] | eight vlan tag values | internal power on reset
udp/tcp port filters [15:0] | 16 destination port values | internal power on reset
flexible 128 bytes tco filters [3:0] | length values for four flex tco filters | internal power on reset
ipv4 and ipv6 address filters [3:0] | ip address for manageability filtering | internal power on reset
l2 ethertype filters [3:0] | 4 l2 ethertype values | internal power on reset

not all filtering capabilities are available on both the nc-si and smbus interfaces. all filters are reset only on internal power on reset. register filters that enable filters or functionality are also reset by firmware. these registers can be loaded from the nvm following a reset.

the high-level structure of manageability filtering consists of three steps. the first 2 steps are shared with host filtering:
1. packets are filtered by l2 criteria (mac address and unicast/multicast/broadcast).
2. packets are filtered by vlan if a vlan tag is present.
3. packets are filtered by the manageability filters (port, ip, flex, etc.).

some general rules apply:
• fragmented packets are passed to manageability but not parsed beyond the ip header.
• packets with l2 errors (crc, alignment, etc.) are not forwarded to the mc.

if the mc uses a dedicated mac address/vlan tag, it should take care not to use l3/l4 decision filtering. otherwise, all the packets with the manageability mac address/vlan tag filtered out at l3/l4 are forwarded to the host.
the manageability filtering stage combines checks done at previous stages with additional l3/l4 checks to make the decision to route a packet to the mc. the following sections describe manageability filtering done at layers l3/l4, followed by the final filtering rules. filtering rules are created by the mc programming the decision filters.

10.4.2 ethertype filters

manageability l2 ethertype filters allow filtering of received packets based on the layer 2 ethertype field. the l2 type field of incoming packets is compared against the ethertype filters programmed in the manageability ethertype filter (metf; up to 4 filters); the result is incorporated into decision filters.

each manageability ethertype filter can be configured as pass (positive) or reject (negative) using a polarity bit. in order for the reverse polarity mode to be effective and block certain types of packets, the ethertype filter should be part of all the enabled decision filters.

examples for usage of l2 ethertype filters are:
• block packets with the nc-si ethertype from being routed to the management controller. the nc-si ethertype is used for communication between the management controller on the nc-si link and the 82576. packets coming from the network are not expected to carry this ethertype and such packets are blocked to prevent attacks on the management controller.
• determine the destination of 802.1x control packets. the 802.1x protocol is executed at different times either by the management controller or by the host. l2 ethertype filters are used to route these packets to the proper agent.

10.4.3 l2 layer filtering

a packet passes successfully through l2 filtering if any of the following conditions are met:
1. it is a unicast packet and promiscuous unicast filtering is enabled.
2. it is a unicast packet and it matches one of the unicast mac filters (host or manageability).
3. it is a multicast packet and promiscuous multicast filtering is enabled.
4. it is a multicast packet and it matches one of the multicast filters.
5. it is a broadcast packet.

see also: section 7.1.2, l2 packet filtering.

10.4.4 l3/l4 filtering

the manageability filtering stage combines checks done at previous stages with additional l3/l4 checks to make the decision on whether to route a packet to the mc. the following sections describe the manageability filtering done at layers l3/l4 and the final filtering rules.

10.4.4.1 arp filtering

the 82576 supports filtering of arp request packets (initiated externally) and arp responses (to requests initiated by the mc or host).

in smbus mode, there are arp filters that can be enabled. arp filtering is not specifically available when using nc-si. however, the general filtering mechanism can be utilized to filter incoming arp traffic.
10.4.4.2 neighbor discovery filtering

the 82576 supports filtering of neighbor solicitation packets (type 135). neighbor solicitation uses the ipv6 destination address filters defined in the ip6at registers (all enabled ipv6 addresses are matched for neighbor solicitation).

in smbus mode, there is a specific neighbor discovery filter that can be enabled. the nc-si interface does not have a filter for this. however, the general filtering mechanism can be utilized to filter this type of traffic.

10.4.4.3 rmcp filtering

the 82576 supports filtering by fixed destination port numbers, port 0x26f and port 0x298. these ports are iana reserved for ipmi.

in smbus mode, there are filters that can be enabled for these ports. when using nc-si, they are not specifically available. however, the general filtering mechanism can be utilized to filter incoming rmcp traffic.

10.4.4.4 flexible port filtering

the 82576 implements 16 flex destination port filters. the 82576 directs packets whose l4 destination port matches to the mc. the mc must ensure that only valid entries are enabled in the decision filters.

10.4.4.5 flexible 128 byte filter

the 82576 provides four flex tco filters. each filter looks for a pattern match within the first 128 bytes of the packet. the mc must ensure that only valid entries are enabled in decision filters.

flex filters are temporarily disabled when read from or written to by the host. any packet received during a read or write operation is dropped. filter operation resumes once the read or write access completes.

10.4.4.5.1 flexible filter structure

each filter is composed of the following fields:
1. flexible filter length - this field indicates the number of bytes in the packet header that should be inspected. the field also indicates the minimal length of packets inspected by the filter. packets below that length are not inspected. valid values for this field are: 8*n, where n = 1 to 8.
2. data - this is a set of up to 128 bytes comprised of values that header bytes of packets are tested against.
3. mask - this is a set of 128 bits corresponding to the 128 data bytes; each bit indicates whether the corresponding packet byte is tested against its data byte.

the general filter is 128 bytes that the mc configures; not all of these bytes may be needed or used for the filtering, so the mask is used to indicate which of the 128 bytes are used for the filter.

each filter tests the first 128 bytes (or less) of a packet, where not all bytes must necessarily be tested.

10.4.4.5.2 tco filter programming
programming each filter is done using the following commands (nc-si or smbus) in a sequential manner:
1. filter mask and length - this command configures the following fields:
   a. mask - a set of 16 bytes containing the 128 bits of the mask. bit 0 of the first byte corresponds to the first byte on the wire.
   b. length - a 1-byte field indicating the length.
2. filter data - the filter data is divided into groups of bytes as follows:

   group | test bytes
   0x0 | 0-29
   0x1 | 30-59
   0x2 | 60-89
   0x3 | 90-119
   0x4 | 120-127

   each group of bytes needs to be configured using a separate command, where the group number is given as a parameter. the command has the following parameters:
   a. group number - a 1-byte field indicating the current group addressed.
   b. data bytes - up to 30 bytes of test bytes for the current group.

10.4.4.6 ip address filtering

the 82576 supports filtering by ip address using ipv4 and ipv6 address filters. these are dedicated to manageability.

10.4.4.7 checksum filtering

if bit manc.en_xsum_filter is set, the 82576 directs packets to the mc only if they pass l3/l4 checksum (if they exist) in addition to matching other filters previously described.

enabling the xsum filter when using the smbus interface is accomplished by setting the enable xsum filtering to manageability bit within the manageability control (manc) register. this is done using the update management receive filter parameters command. see section 10.5.10.1.5.1.

to enable the xsum filtering when using nc-si, use the enable checksum offloading command. see section 10.6.2.14.

10.4.5 configuring manageability filters

there are a number of pre-defined filters that are available for the mc to enable, such as arps and ipmi ports 298h and 26fh. these are generally enabled by setting the appropriate bit within the manc register using specific commands.

for more advanced filtering needs, the mc has the ability to configure a number of configurable filters. it is a two-step process to use these filters. they must first be configured and then enabled.
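as an illustration of the configuration step, the flex tco filter parameters from sections 10.4.4.5.1 and 10.4.4.5.2 could be prepared as in the sketch below. it builds the 16-byte mask, splits the filter data into the 30-byte groups, and models the match rule in software. the function names are hypothetical, no real smbus or nc-si command encoding is shown, and the bit order beyond "bit 0 of the first byte is the first byte on the wire" (bit k of mask byte j covers packet byte 8*j+k) is an assumption.

```python
FILTER_SIZE = 128  # bytes covered by one flex TCO filter
GROUP_SIZE = 30    # test bytes carried by one filter data command


def pack_mask(tested_offsets) -> bytes:
    """Build the 16-byte mask; bit 0 of byte 0 covers the first byte on the wire."""
    mask = bytearray(FILTER_SIZE // 8)
    for offset in tested_offsets:
        if not 0 <= offset < FILTER_SIZE:
            raise ValueError("offset outside the 128-byte filter")
        mask[offset // 8] |= 1 << (offset % 8)  # assumed bit order within a byte
    return bytes(mask)


def split_groups(data: bytes) -> list[tuple[int, bytes]]:
    """Split filter data into (group number, test bytes) pairs of up to 30 bytes."""
    if len(data) > FILTER_SIZE:
        raise ValueError("filter data is limited to 128 bytes")
    return [(i // GROUP_SIZE, data[i:i + GROUP_SIZE])
            for i in range(0, len(data), GROUP_SIZE)]


def flex_match(packet: bytes, data: bytes, mask: bytes, length: int) -> bool:
    """Software model: every masked byte within `length` must equal the filter data.

    `data` is assumed to be padded to the full 128 bytes.
    """
    if len(packet) < length:
        return False  # packets shorter than the filter length are not inspected
    return all(packet[i] == data[i]
               for i in range(length)
               if (mask[i // 8] >> (i % 8)) & 1)
```

for a full 128-byte filter, `split_groups` yields five groups (0x0 to 0x4), the last carrying only bytes 120-127, which matches the group table in section 10.4.4.5.2.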
10.4.5.1 manageability decision filters (mdef) and extended manageability decision filters (mdef_ext)

manageability decision filters are a set of eight filters, each with the same structure. the filtering rule for each decision filter is programmed by the mc and defines which of the l2, vlan, and manageability filters participate in decision making. any packet that passes at least one rule is directed to manageability and possibly to the host.

with the 82576, packets can also be filtered by ethertype. this is part of the extended manageability decision filters (mdef_ext).

the inputs to each decision filter are:
• packet passed a valid management l2 unicast address filter.
• packet is a broadcast packet.
• packet has a vlan header and it passed a valid manageability vlan filter.
• packet matched one of the valid ipv4 or ipv6 manageability address filters.
• packet is a multicast packet.
• packet passed arp filtering (request or response).
• packet passed neighbor solicitation filtering.
• packet passed 0x298/0x26f port filter.
• packet passed a valid flex port filter.
• packet passed a valid flex tco filter.
• packet passed or failed an l2 ethertype filter.

the structure of each decision filter is shown in figure 10-2. a boxed number indicates that the input is conditioned by a mask bit defined in the mdef register and mdef_ext register for this rule.

decision filter rules are as follows:
• at least one bit must be set in a register. if all bits are cleared (mdef/mdef_ext = 0x0000), then the decision filter is disabled and ignored.
• all enabled AND filters must match for the decision filter to match. an AND filter not enabled in the mdef/mdef_ext registers is ignored.
• if no OR filter is enabled in the register, the OR filters are ignored in the decision (the filter might still match).
• if one or more OR filters are enabled in the register, then at least one of the enabled OR filters must match for the decision filter to match.
a decision filter (for any of the 8 filters) defines which of the above inputs is enabled as part of a filtering rule. the mc programs two 32-bit registers per rule (mdef[7:0] and mdef_ext[7:0]). a set bit enables its corresponding filter to participate in the filtering decision.

figure 10-2. manageability decision filters
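the decision filter rules can be expressed as a small software model. the sketch below uses the AND/OR bit assignments of table 10-1 (mdef) and table 10-2 (mdef_ext). it is an interpretation of the written rules, not a description of the hardware: the function name is hypothetical, the OR inputs of mdef and mdef_ext are assumed to form a single pool, and `ext_hits` is assumed to already reflect each ethertype filter's polarity.

```python
MDEF_AND_MASK = 0x0000004F  # bits 3:0 and bit 6 are AND inputs (table 10-1)
MDEF_OR_MASK = 0xFFFFFFB0   # bits 5:4 and bits 31:7 are OR inputs (table 10-1)
EXT_AND_MASK = 0x0000000F   # mdef_ext bits 3:0: L2 ethertype, AND (table 10-2)
EXT_OR_MASK = 0x00000F00    # mdef_ext bits 11:8: L2 ethertype, OR (table 10-2)


def decision_filter_matches(mdef: int, mdef_ext: int, hits: int, ext_hits: int) -> bool:
    """Evaluate one decision rule.

    hits / ext_hits carry a 1 in every bit position whose filter input
    matched the received packet.
    """
    if mdef == 0 and mdef_ext == 0:
        return False  # rule disabled and ignored
    and_en = (mdef & MDEF_AND_MASK, mdef_ext & EXT_AND_MASK)
    or_en = (mdef & MDEF_OR_MASK, mdef_ext & EXT_OR_MASK)
    # Every enabled AND input must have matched.
    and_ok = (hits & and_en[0]) == and_en[0] and (ext_hits & and_en[1]) == and_en[1]
    if not any(or_en):
        return and_ok  # no OR input enabled: OR inputs are ignored
    # At least one enabled OR input must have matched.
    or_ok = bool(hits & or_en[0]) or bool(ext_hits & or_en[1])
    return and_ok and or_ok


def routed_to_mc(rules, hits: int, ext_hits: int) -> bool:
    """A packet goes to the MC if it passes at least one of the rules."""
    return any(decision_filter_matches(m, e, hits, ext_hits) for m, e in rules)
```

for example, a rule with bit 0 (l2 unicast address, AND) and bits 10 and 11 (ports 0x298 and 0x26f, OR) set matches a packet that hits the manageability mac address and either of the two rmcp ports, but not a packet that hits only the mac address.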
in default mode, packets that are directed to the mc are not directed to host memory.

table 10-1. assignment of decision filter bits (mdef)

filter | and/or input | mask bits in mdef[7:0]
l2 unicast address | and | 0
broadcast | and | 1
manageability vlan | and | 2
ip address | and | 3
l2 unicast address | or | 4
broadcast | or | 5
multicast | and | 6
arp request (1) | or | 7
arp response (1) | or | 8
neighbor discovery | or | 9
port 0x298 | or | 10
port 0x26f | or | 11
flex port 15:0 | or | 27:12
flex tco 3:0 | or | 31:28

(1) ip address checking on arp packets is configured using the advanced receive enable command.

table 10-2. assignment of decision filter bits (mdef_ext)

filter | and/or input | mask bits in mdef_ext[7:0]
l2 ethertype | and | 3:0
reserved | -- | 7:4
l2 ethertype | or | 11:8
reserved | -- | 31:12

10.4.5.2 management to host filter

when a packet passes the filters for manageability, it is only sent to the mc and not to the host. there are times when it is desirable for incoming packets to be sent to both places. this commonly occurs with broadcast and multicast packets. a common example is arp requests; both the mc and the host usually have a need to receive arp requests.

to provide a mechanism allowing packets to be passed to both the mc and the host, the management to host (manc2h) filter was created. this filter is enabled using a two-step process. first the filter is configured and then enabled.
system manageability ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 743 the manc2h filter can be configured to pass multicast and broadcast packets to the host by enabling specific bits (6 & 7 respectively) within the manc2h register. in addition, any traffic matching any of the first 5 configurable filters (see section 10.4.5.1 ) can be used as filters to pass traffic to the host. when using the smbus interface, the mc enables these filters by issuing the update management receive filter parameters command (see section 10.5.10.1.5.1 ) with the parameter of 0x0a and then enabling the manc2h bit within the manc register with the update management receive filter parameters command with the parameter 0x01. the manc2h is also configurable when using nc-si using the set intel filters ? manageability to host command (see section 10.6.2.6.3 ). as the nc-si interface is designed to be used with a dedicated mac address, the default behavior is for the 82576 to enable broadcast and multicast packets for manc2h. 10.4.6 possible configurations this section describes ways of using management filters. actual usage may vary. 10.4.6.1 dedicated mac packet filtering ? select one of the eight rules for dedicated mac filtering. ? set bit 0 of the decision rule to enforce mac address filtering. ? set other bits to qualify which packets are allowed to pass through. for example: ? set bit 2 to qualify with manageability vlan. ? set bit 3 to qualify with a match to an ip address. ? set any l3/l4 bits (30:7) to qualify with any of a set of l3/l4 filters. bits description default 0 decision filter 0 determines if packets that have passed decision filter 0 are also forwarded to the host operating system. 1 decision filter 1 determines if packets that have passed decision filter 1 are also forwarded to the host operating system. 
2      Decision filter 2. Determines if packets that have passed decision filter 2 are also forwarded to the host operating system.
3      Decision filter 3. Determines if packets that have passed decision filter 3 are also forwarded to the host operating system.
4      Decision filter 4. Determines if packets that have passed decision filter 4 are also forwarded to the host operating system.
5      Unicast and mixed. Determines if unicast packets are also forwarded to the host operating system.
6      Global multicast. Determines if multicast packets are also forwarded to the host operating system.
7      Broadcast. Determines if broadcast packets are also forwarded to the host operating system.
31:8   Reserved.
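As an illustration of the MANC2H bit assignments above, the following Python sketch composes a register value from the selected options. The constant and function names are hypothetical and chosen for this example only; writing the value to the device (for example, through the SMBus or NC-SI commands mentioned in section 10.4.5.2) is not shown.

```python
# Bit positions taken from the MANC2H bit table; names are illustrative only.
MANC2H_UNICAST = 1 << 5
MANC2H_MULTICAST = 1 << 6
MANC2H_BROADCAST = 1 << 7


def manc2h_value(decision_filters=(), unicast=False, multicast=False, broadcast=False):
    """Return a MANC2H value forwarding the selected traffic to the host as well."""
    value = 0
    for index in decision_filters:
        if not 0 <= index <= 4:
            raise ValueError("only decision filters 0..4 have MANC2H bits")
        value |= 1 << index
    if unicast:
        value |= MANC2H_UNICAST
    if multicast:
        value |= MANC2H_MULTICAST
    if broadcast:
        value |= MANC2H_BROADCAST
    return value


if __name__ == "__main__":
    # Forward multicast and broadcast management traffic to the host too.
    print(hex(manc2h_value(multicast=True, broadcast=True)))
```

Bits 31:8 are reserved, so the sketch never sets them.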
10.4.6.2 Broadcast Packet Filtering

• Select one of the eight rules for broadcast filtering.
• Set bit 1 of the decision rule to enforce broadcast filtering.
• Set other bits to qualify which broadcast packets are allowed to pass through. For example:
  • Set bit 2 to qualify with manageability VLAN.
  • Set bit 3 to qualify with a match to an IP address.
  • Set any L3/L4 bits (30:7) to qualify with any of a set of L3/L4 filters.

10.4.6.3 VLAN Packet Filtering

• Select one of the eight rules for VLAN filtering.
• Set bit 2 of the decision rule to enforce VLAN filtering.
• Set other bits to qualify which VLAN packets are allowed to pass through. For example:
  • Set any L3/L4 bits (30:7) to qualify with any of a set of L3/L4 filters.

IPv6 filtering is done using the following IPv6-specific filters:

• IP unicast filtering - requires filtering for a link local address and a global address. Filtering setup might depend on whether the MAC address is shared with the host or dedicated to manageability:
  • Dedicated MAC address (for example, dynamic address allocation with DHCP does not support multiple IP addresses for one MAC address). In this case, filtering can be done at L2 using two dedicated unicast MAC filters.
  • Shared MAC address (for example, static address allocation sharing addresses with the host). In this case, filtering needs to be done at L3, requiring two IPv6 address filters, one per address.
• A neighbor discovery filter - the 82576 supports the IPv6 neighbor discovery protocol. Since the protocol relies on multicast packets, the 82576 supports filtering of these packets. IPv6 multicast addresses are translated into corresponding Ethernet multicast addresses in the form of 33-33-xx-xx-xx-xx, where the last 32 bits of the address are taken from the last 32 bits of the IPv6 multicast address.
As a result, two direct MAC filters can be used to filter IPv6 solicited-node multicast packets as well as IPv6 all-node multicast packets.

10.4.6.4 Receive Filtering with Shared IP

When using the SMBus interface, it is possible to share the host MAC and IP address with the MC. This functionality is not available when using NC-SI. When the MC shares the MAC and IP address with the host, receive filtering is based on identifying specific flows through port allocation. The following settings might be used:

• Select one of the eight rules.
• Set a manageability dedicated MAC filter to the host MAC address and set bit 0 in the mng_filter_rule register.
• If VLAN is used for management, load one or more management VLAN filters and set bit 2 in the mng_filter_rule register.

The ARP filter/neighbor discovery filter is enabled when the MC is responsible for handling the ARP protocol. Set bit 7 or bit 8 in the mng_filter_rule register for this functionality.
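The bit-by-bit recipes in sections 10.4.6.1 through 10.4.6.4 can be sketched as a small helper that builds a decision rule value from named filters. This is a minimal sketch: the dictionary keys and the function name are hypothetical, the bit positions are copied from Table 10-1, and it is an assumption of this example that the decision rule registers named in the text share that numbering.

```python
# Bit positions copied from Table 10-1; key names are illustrative only.
MDEF_BITS = {
    "l2_unicast_and": 0,
    "broadcast_and": 1,
    "vlan_and": 2,
    "ip_address_and": 3,
    "l2_unicast_or": 4,
    "broadcast_or": 5,
    "multicast_and": 6,
    "arp_request_or": 7,
    "arp_response_or": 8,
    "neighbor_discovery_or": 9,
    "port_0x298_or": 10,
    "port_0x26f_or": 11,
}


def decision_rule(*filters):
    """OR together the mask bits of the named filters into one rule value."""
    value = 0
    for name in filters:
        value |= 1 << MDEF_BITS[name]
    return value


if __name__ == "__main__":
    # Shared MAC with management VLAN, plus ARP requests handled by the MC.
    print(hex(decision_rule("l2_unicast_and", "vlan_and", "arp_request_or")))
```

Flex port and flex TCO bits (27:12 and 31:28) are left out to keep the example short.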
The MC can determine the MAC address of the host by issuing the get system MAC address command (available for both the SMBus and NC-SI interfaces). Determining the IP address being used by the host is beyond the scope of this document.

10.4.7 Determining Manageability MAC Address

If the MC wishes to use a dedicated MAC address or configure the automatic ARP response mechanism (only available in SMBus mode), it may be beneficial for the MC to be able to determine the MAC address used by the host. Both the NC-SI and SMBus interfaces provide an Intel OEM command to read the system MAC address. A possible use for this is when the MAC address programmed at manufacturing time is incremented by two for each device rather than by one. In this way, the MC can read the system MAC address, add one to it, and be guaranteed a unique MAC address.

10.5 SMBus Pass-Through Interface

SMBus is the system management bus defined by Intel. It is used in personal computers and servers for low-speed system management communications. This section describes how the SMBus interface operates in pass-through mode.

10.5.1 General

The SMBus sideband interface includes standard SMBus commands used for assigning a slave address and gathering device information, as well as Intel proprietary commands used specifically for the pass-through interface.

10.5.2 Pass-Through Capabilities

This section details the manageability capabilities the 82576 provides while in SMBus mode. Pass-through traffic is carried by the sideband interface as described in section 10.1. These services are not available in NC-SI mode.

When operating in SMBus mode, in addition to exposing a communication channel to the LAN for the MC, the 82576 provides the following manageability services to the MC:

• ARP handling -
The 82576 can be programmed to reply automatically to ARP request packets, reducing the traffic over the MC interconnect.
• Teaming and fail-over - the 82576 can be configured to one of several teaming and fail-over configurations:
  • No-teaming - the 82576 dual LAN ports act independently of each other. As a result, no fail-over is supported by the 82576; the MC is responsible for teaming and fail-over.
  • Teaming - the 82576 can be configured to provide fail-over capabilities, such that manageability traffic is routed to an active port if any of the ports fail. Several modes of operation are supported.
• Default configuration of filters by EEPROM - when working in SMBus mode, the default values of the manageability receive filters can be set according to the PT LAN and flex TCO EEPROM structures.
10.5.3 Pass-Through Multi-Port Modes

Pass-through configurations depend on the way the LAN ports are configured. If the LAN ports are configured as two different channels (non-teaming mode), the 82576 is presented on the SMBus manageability link as two different devices (for example, via two different SMBus addresses on which each device is connected to a different LAN port). In this mode (the same as in the LAN channels), there is no logical connection between the two devices, and fail-over between the two LAN ports is done by the MC (by sending/receiving packets through different devices). The status report to the MC, ARP handling, DHCP, and other pass-through functionality are unique for each port and configured by the MC.

When the LAN ports are configured to work as one LAN channel (teaming mode), the 82576 presents itself on the SMBus as one device (one SMBus address). In this mode, the external MC is not aware that there are two LAN ports. The 82576 decides how to route the packet that it receives from the LAN according to the fail-over algorithm. The status report to the MC and other pass-through configuration are common to both ports.

10.5.4 Automatic Ethernet ARP Operation

Automatic Ethernet ARP parameters are loaded from the NVM when the 82576 is powered up, or are configured through the sideband management interface. The following parameters should be configured in order to enable ARP operation:

• ARP auto-reply enabled
• ARP IP address (to filter ARP packets)
• ARP MAC addresses (for ARP responses)

These are all configurable over the sideband interface using the advanced version of the receive enable command.

When an ARP request packet is received and ARP auto-reply is enabled, the 82576 checks the targeted IP address (after the packet has passed L2 checks and ARP checks).
If the targeted IP matches the IP configuration for the 82576, it replies with an ARP response. The 82576 responds to ARP requests targeted to the ARP IP address with the configured ARP MAC address. If there is no match, the 82576 silently discards the packet. If the 82576 is not configured to do auto-ARP response, it can be configured to forward the ARP packets to the MC (which can respond to ARP requests).

When the external MC uses the same IP and MAC address as the OS, the ARP operation should be coordinated with the host operating system.

Note: If sharing the MAC and IP with the host operating system is possible, the 82576 provides the ability to read the system MAC address, allowing the MC to share the MAC address. However, the 82576 provides no mechanism to read the IP address. The host OS (or an agent within it) and the MC must coordinate the sharing of IP addresses.

10.5.4.1 ARP Packet Formats
Table 10-3. ARP Request Packet

Offset   # of Bytes   Field                       Value (in Hex)   Action
0        6            Destination address                          Compare
6        6            Source address                               Stored
12       8            Possible LLC/SNAP header                     Stored
12       4            Possible VLAN tag                            Stored
12       2            Type                        0806             Compare
14       2            HW type                     0001             Compare
16       2            Protocol type               0800             Compare
18       1            Hardware size               06               Compare
19       1            Protocol address length     04               Compare
20       2            Operation                   0001             Compare
22       6            Sender HW address           -                Stored
28       4            Sender IP address           -                Stored
32       6            Target HW address           -                Ignore
38       4            Target IP address           ARP IP address   Compare

Table 10-4. ARP Response Packet

Offset   # of Bytes   Field                       Value
0        6            Destination address         ARP request source address
6        6            Source address              Programmed from EEPROM or MC
12       8            Possible LLC/SNAP header    From ARP request
12       4            Possible VLAN tag           From ARP request
12       2            Type                        0x0806
14       2            HW type                     0x0001
16       2            Protocol type               0x0800
18       1            Hardware size               0x06
19       1            Protocol address length     0x04
20       2            Operation                   0x0002
22       6            Sender HW address           Programmed from EEPROM or MC
Table 10-4. ARP Response Packet (Continued)

Offset   # of Bytes   Field                       Value
28       4            Sender IP address           Programmed from EEPROM or MC
32       6            Target HW address           ARP request sender HW address
38       4            Target IP address           ARP request sender IP address

Table 10-5. Gratuitous ARP Packet

Offset   # of Bytes   Field                       Value
0        6            Destination address         Broadcast address
6        6            Source address
12       2            Type                        0x0806
14       2            HW type                     0x0001
16       2            Protocol type               0x0800
18       1            Hardware size               0x06
19       1            Protocol address length     0x04
20       2            Operation                   0x0001
22       6            Sender HW address
28       4            Sender IP address
32       6            Target HW address
38       4            Target IP address

10.5.5 SMBus Transactions

This section gives a brief overview of the SMBus protocol. Following is an example of the format of a typical SMBus transaction.

Bits:    1 | 7             | 1  | 1 | 8         | 1 | 8                | 1 | 1
Field:   S | Slave address | Wr | A | Command   | A | PEC              | A | P
Value:     | 1100 001      | 0  | 0 | 0000 0010 | 0 | [data dependent] | 0 |

The top row of the table identifies the bit length of the field in a decimal bit count. The middle row identifies the name of the fields used in the transaction. The last row appears only with some transactions, and lists the value expected for the corresponding field. This value can be either hexadecimal or binary.

The SMBus controller is a master for some transactions and a slave for others. The differences are identified in this document. Shorthand field names are listed in Table 10-6 and are fully defined in the SMBus specification.
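The automatic ARP reply of section 10.5.4 can be sketched from the offsets in Table 10-3 and Table 10-4. This is only an illustrative model, not the device's implementation: the function name is hypothetical, the frame is assumed to carry no VLAN tag and no LLC/SNAP header, and the L2 destination check is assumed to have passed already.

```python
from typing import Optional

# Type 0806, HW type 0001, protocol type 0800, hardware size 06,
# protocol address length 04 (offsets 12..19 of an untagged frame).
ARP_FIXED = bytes.fromhex("0806000108000604")
OP_REQUEST = b"\x00\x01"
OP_RESPONSE = b"\x00\x02"


def auto_arp_reply(frame: bytes, arp_ip: bytes, arp_mac: bytes) -> Optional[bytes]:
    """Return an ARP response for a matching request, or None to discard."""
    if len(frame) < 42:
        return None
    if frame[12:20] != ARP_FIXED or frame[20:22] != OP_REQUEST:
        return None
    if frame[38:42] != arp_ip:  # target IP must equal the configured ARP IP
        return None
    sender_hw, sender_ip = frame[22:28], frame[28:32]
    return (
        frame[6:12]      # destination = ARP request source address
        + arp_mac        # source = configured ARP MAC address
        + ARP_FIXED
        + OP_RESPONSE
        + arp_mac        # sender HW address
        + arp_ip         # sender IP address
        + sender_hw      # target HW address = request sender HW address
        + sender_ip      # target IP address = request sender IP address
    )


if __name__ == "__main__":
    host = bytes.fromhex("020000000001")
    request = (b"\xff" * 6 + host + ARP_FIXED + OP_REQUEST
               + host + bytes([10, 0, 0, 1]) + bytes(6) + bytes([10, 0, 0, 2]))
    reply = auto_arp_reply(request, bytes([10, 0, 0, 2]), bytes.fromhex("020000000002"))
    print(reply.hex())
```

A request whose target IP does not match returns None, mirroring the silent discard described in the text.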
10.5.5.1 SMBus Addressing

The SMBus addresses to which the 82576 responds depend on the LAN mode (teaming/non-teaming). If the LAN is in teaming mode (fail-over), the 82576 is presented over the SMBus as one device and has one SMBus address. In non-teaming mode, the 82576 is presented as two SMBus devices on the SMBus (two addresses). In dual-address mode, all pass-through functionality is duplicated on each SMBus address, and each SMBus address is connected to a different LAN port.

Note that it is not permitted to configure both ports to the same SMBus address. When a LAN function is disabled, the corresponding SMBus address is not presented to the MC (see section 10.5.11.1).

The SMBus addressing mode is defined through the SMBus addressing mode bit in the EEPROM. SMBus addresses are set in the SMBus address 0 and SMBus address 1 fields of the EEPROM. Note that if single-address mode is set, only the SMBus address 0 field is valid.

SMBus addresses (enabled from the NVM) can be re-assigned using the SMBus ARP protocol.

Apart from the SMBus address values, all parameters of the SMBus (SMBus channel selection, address mode, and address enable) can be set only through NVM configuration. Note that the NVM is read at power-up and at resets of the 82576.

All SMBus addresses should be in network byte order (NBO); MSB first.

10.5.5.2 SMBus ARP Functionality

The 82576 supports the SMBus ARP protocol as defined in the SMBus 2.0 specification. The 82576 is a persistent slave address device, so its SMBus address is valid after power-up and loaded from the NVM. The 82576 supports all SMBus ARP commands defined in the SMBus specification, both general and directed.

SMBus ARP capability can be disabled through the NVM.

10.5.5.3 SMBus ARP Flow

SMBus ARP flow is based on the status of two flags:
• AV (address valid): this flag is set when the 82576 has a valid SMBus address.
• AR (address resolved): this flag is set when the 82576 SMBus address is resolved (the SMBus address was assigned by the SMBus ARP process).

Table 10-6. Shorthand Field Names

Field Name   Definition
S            SMBus start symbol
P            SMBus stop symbol
PEC          Packet error code
A            ACK (acknowledge)
N            NACK (not acknowledge)
Rd           Read operation (read value = 1b)
Wr           Write operation (write value = 0b)
These flags are internal 82576 flags and are not exposed to external SMBus devices.

Since the 82576 is a persistent SMBus address (PSA) device, the AV flag is always set, while the AR flag is cleared after power-up until the SMBus ARP process completes. Since AV is always set, the 82576 always has a valid SMBus address.

When the SMBus master needs to start an SMBus ARP process, it resets (in terms of ARP functionality) all devices on the SMBus by issuing either prepare to ARP or reset device commands. When the 82576 accepts one of these commands, it clears its AR flag (if set from a previous SMBus ARP process), but not its AV flag (the current SMBus address remains valid until the end of the SMBus ARP process).

Clearing the AR flag means that the 82576 responds to SMBus ARP transactions that are issued by the master. The SMBus master issues a get UDID command (general or directed) to identify the devices on the SMBus. The 82576 always responds to the directed command, and responds to the general command only if its AR flag is not set. After the get UDID, the master assigns the 82576 SMBus address by issuing an assign address command. The 82576 checks whether the UDID matches its own UDID and, if it matches, switches its SMBus address to the address assigned by the command (byte 17). After accepting the assign address command, the AR flag is set and from this point (as long as the AR flag is set), the 82576 does not respond to the get UDID general command. Note that all other commands are processed even if the AR flag is set. The 82576 stores the SMBus address that was assigned in the SMBus ARP process in the NVM, so at the next power-up it returns to its assigned SMBus address.
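The flag handling described above can be summarized as a small state model. The class and method names below are hypothetical and exist only for this sketch; persistence of the assigned address to the NVM and the SMBus wire format are deliberately left out.

```python
class SmbusArpDevice:
    """Toy model of the AV/AR flag behavior of a persistent-address device."""

    def __init__(self, address: int, udid: bytes):
        self.address = address
        self.udid = udid
        self.av = True    # persistent SMBus address device: always set
        self.ar = False   # cleared after power-up

    def prepare_to_arp(self) -> None:
        self.ar = False   # AV and the current address are unaffected

    def reset_device(self) -> None:
        self.ar = False   # AV and the current address are unaffected

    def responds_to_get_udid(self, directed: bool) -> bool:
        # Directed: always. General: only while the address is unresolved.
        return directed or not self.ar

    def assign_address(self, udid: bytes, new_address: int) -> bool:
        if udid != self.udid:
            return False  # UDID mismatch: the command is not accepted
        self.address = new_address
        self.ar = True
        return True


if __name__ == "__main__":
    dev = SmbusArpDevice(address=0x49, udid=bytes(16))
    dev.prepare_to_arp()
    print(dev.responds_to_get_udid(directed=False))
    dev.assign_address(bytes(16), 0x50)
    print(dev.responds_to_get_udid(directed=False))
```

The addresses used in the usage lines are arbitrary example values.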
Figure 10-3 shows the 82576 SMBus ARP flow.

Figure 10-3. SMBus ARP Flow
10.5.5.4 SMBus ARP UDID Content

The UDID provides a mechanism to isolate each device for the purpose of address assignment. Each device has a unique identifier. The 128-bit number is comprised of the following fields (listed from MSB to LSB):

Field                 Size      Value
Device capabilities   1 byte    See notes that follow
Version/revision      1 byte    See notes that follow
Vendor ID             2 bytes   0x8086
Device ID             2 bytes   0x10AA
Interface             2 bytes   0x0004
Subsystem vendor ID   2 bytes   0x0000
Subsystem device ID   2 bytes   0x0000
Vendor specific ID    4 bytes   See notes that follow

Where:

Vendor ID: the device manufacturer's ID as assigned by the SBS Implementers' Forum or the PCI SIG. Constant value: 0x8086.
Device ID: the device ID as assigned by the device manufacturer (identified by the vendor ID field). Constant value: 0x10AA.
Interface: identifies the protocol layer interfaces supported over the SMBus connection by the device. In this case, SMBus version 2.0. Constant value: 0x0004.
Subsystem fields: these fields are not supported and return zeros.

Device capabilities: dynamic and persistent address, PEC support bit:

Bits   Field           Value
7:6    Address type    01b
5:1    Reserved (0)    0b each
0      PEC supported   0b

Version/revision: UDID version 1, silicon revision:

Bits   Field                  Value
7:6    Reserved (0)           0b each
5:3    UDID version           001b
2:0    Silicon revision ID    See the following table
Silicon revision ID: identifies the silicon version (A0, A1, or A2).

Vendor specific ID: the four LSB bytes of the device Ethernet MAC address. The device Ethernet address is taken from the NVM. Note that in the 82576 there are two MAC addresses (one for each port). Bit 0 of the port 1 MAC address has the inverted value of bit 0 from the EEPROM.

10.5.5.5 SMBus ARP in Dual/Single Mode

The 82576 operates in either single SMBus address mode or dual SMBus address mode. The mode determines SMBus ARP behavior.

While operating in single mode, the 82576 presents itself on the SMBus as one device and only responds to SMBus ARP as one device. In this case, the 82576's SMBus address is SMBus address 0 as defined in the EEPROM SMBus ARP address word. The 82576 has only one AR flag and one AV flag. The vendor specific ID (the MAC address of the LAN port) is taken from the port 0 address.

In dual mode, the 82576 responds as two SMBus devices having two sets of AR/AV flags (one for each port). The 82576 responds twice to the SMBus ARP master, once for each port. Both SMBus addresses are taken from the SMBus ARP address word of the EEPROM. Note that the unique device identifier (UDID) differs between the two ports in the vendor specific ID field (which represents the MAC address and is different for the two ports). It is recommended that the 82576 first respond as port 0, and only when an address is assigned, start responding as port 1 to the get UDID command.

10.5.5.6 Concurrent SMBus Transactions

In single-address mode, concurrent SMBus transactions (receive, transmit, and configuration read/write) are allowed without limitation. Transmit fragments can be sent between receive fragments, and configuration read/write commands can be issued between receive and transmit fragments.

In dual-address mode, the same rules apply to concurrent traffic between the two addresses supported by the 82576.
Packets can only be transmitted from one port/device at a given time. As a result, the MC must finish sending a packet (by sending a last fragment command) on one port before starting transmission on the other port.

Silicon revision ID encoding:

Silicon Version   Revision ID
A0                000b
A1                001b
A2                010b

Vendor specific ID layout (1 byte each, MSB to LSB):

MAC address, byte 3 | MAC address, byte 2 | MAC address, byte 1 | MAC address, byte 0
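The UDID layout of section 10.5.5.4 can be illustrated by packing the fields MSB first. The function name is hypothetical, and the caller is assumed to supply the four vendor specific bytes already in the MSB-to-LSB order of the layout above; how those bytes map to a particular MAC address notation is not decided here.

```python
import struct


def build_udid(silicon_rev: int, vendor_specific: bytes) -> bytes:
    """Pack the 16-byte UDID from the fixed fields and the per-port bytes."""
    if not 0 <= silicon_rev <= 0b111:
        raise ValueError("silicon revision ID is a 3-bit field")
    if len(vendor_specific) != 4:
        raise ValueError("vendor specific ID is 4 bytes")
    capabilities = 0b01 << 6                    # address type 01b, PEC supported 0b
    version_revision = (0b001 << 3) | silicon_rev
    fixed = struct.pack(
        ">BBHHHHH",
        capabilities,
        version_revision,
        0x8086,   # vendor ID
        0x10AA,   # device ID
        0x0004,   # interface
        0x0000,   # subsystem vendor ID (not supported)
        0x0000,   # subsystem device ID (not supported)
    )
    return fixed + vendor_specific


if __name__ == "__main__":
    # Example bytes only; a real device derives them from its MAC address.
    print(build_udid(0b001, bytes.fromhex("a1b2c3d4")).hex())
```

In dual-address mode, two such values would be produced, one per port, differing only in the last four bytes.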
10.5.6 SMBus Notification Methods

The 82576 supports three methods of notifying the MC that it has information that needs to be read by the MC:

• SMBus alert
• Asynchronous notify
• Direct receive

The notification method used by the 82576 can be configured from the SMBus using the receive enable command. The default method is set by the NVM in the pass-through init field.

The following events cause the 82576 to send a notification event to the MC:

• Receiving a LAN packet that is designated to the MC.
• Receiving a request status command from the MC, which initiates a status response.
• A status change has occurred and the 82576 is configured to notify the external MC on that status change.
• A change in any of the status data 1 bits of the read status command.

There can be cases where the MC is hung and not responding to the SMBus notification. The 82576 has a time-out value (defined in the NVM) to avoid hanging while waiting for the notification response. If the MC does not respond before the time-out expires, the notification is de-asserted and all pending data is silently discarded.

Note that the SMBus notification time-out value can only be set in the NVM. The MC cannot modify this value.

10.5.6.1 SMBus Alert and Alert Response Method

The SMBus alert# (SMBALERT_N) signal is an additional SMBus signal that acts as an asynchronous interrupt signal to an external SMBus master. The 82576 asserts this signal each time it has a message that it needs the MC to read, provided the chosen notification method is the SMBus alert method. Note that the SMBus alert signal is open-drain, which means that other devices besides the 82576 can be connected to the same alert pin. As a result, the MC needs a mechanism to distinguish between the alert sources.
The MC can respond to the alert by issuing an ARA cycle command to detect the alert source device. The 82576 responds to the ARA cycle with its own SMBus slave address (if it was the SMBus alert source) and de-asserts the alert when the ARA cycle completes. Following the ARA cycle, the MC issues a read command to retrieve the 82576 message.

Some MCs do not implement the ARA cycle transaction. These MCs respond to an alert by issuing a read command to the 82576 (0xC0/0xD0 or 0xDE). The 82576 always responds to a read command, even if it is not the source of the notification. The default response is a status transaction. If the 82576 is the source of the SMBus alert, it replies to the read transaction and then de-asserts the alert after the command byte of the read transaction.

Note: In SMBus alert mode, the SMBALERT_N pin is used for notification. In dual-address mode, both devices generate alerts on events that are independent of each other.

The ARA cycle is an SMBus receive byte transaction to SMBus address 0001-100b. Note that the ARA transaction does not support PEC. The ARA transaction format is as follows:
10.5.6.2 Asynchronous Notify Method

When configured to use the asynchronous notify method, the 82576 acts as an SMBus master and notifies the MC by issuing a modified form of the write word transaction. The asynchronous notify transaction SMBus address and data payload are configured using the receive enable command or using the NVM defaults. Note that the asynchronous notify is not protected by a PEC byte.

The target address and data byte low/high are taken from the receive enable command or the NVM configuration.

10.5.6.3 Direct Receive Method

If configured, the 82576 can send a message it needs to transfer to the external MC as a master over the SMBus, instead of alerting the MC and waiting for it to read the message.

The message format follows. Note that the command used is the same command that is used by the external MC in the block read command. The opcode that the 82576 puts in the data is also the same as the one it puts in the block read command of the same functionality. The rules for the F and L flags (bits) are also the same as in the block read command.

ARA transaction format (section 10.5.6.1):

Bits (as printed):    1 7 1 1 8 1 1 1
Field:                S | Alert response address | Rd | A | Slave device address | A | P
Values (as printed):  0001 100 | 1 | 0 | Manageability slave SMBus address | 0 1

Asynchronous notify transaction format (section 10.5.6.2):

Bits (as printed):    1 7 1 1 7 1 1
Field:                S | Target address | Wr | A | Sending device address | A | ...
Values (as printed):  MC slave address | 0 | 0 | MNG slave SMBus address | 0 | 0

Bits:    8             | 1 | 8              | 1 | 1
Field:   Data byte low | A | Data byte high | A | P
Value:   Interface     | 0 | Alert value    | 0 |

Direct receive message format (section 10.5.6.3):

Bits:    1 | 7                | 1  | 1 | 1          | 1         | 6                   | 1
Field:   S | Target address   | Wr | A | F          | L         | Command             | A | ...
Value:     | MC slave address | 0  | 0 | First flag | Last flag | Receive TCO command | 0
           |                  |    |   |            |           | 01 0000b            |
10.5.7 Receive TCO Flow

The 82576 is used as a channel for receiving packets from the network link and passing them to the external MC. The MC configures the 82576 to pass specific packets to the MC. Once a full packet is received from the link and identified as a manageability packet that should be transferred to the MC, the 82576 starts the receive TCO flow to the MC.

The 82576 uses the SMBus notification method to notify the MC that it has data to deliver. Since the packet size might be larger than the maximum SMBus fragment size, the packet is divided into fragments, where the 82576 uses the maximum fragment size allowed in each fragment (configured via the NVM). The last fragment of the packet transfer is always the status of the packet. As a result, the packet is transferred in at least two fragments. The data of the packet is transferred as part of the receive TCO LAN packet transaction.

When SMBus alert is selected as the MC notification method, the 82576 notifies the MC on each fragment of a multi-fragment packet. When asynchronous notify is selected as the MC notification method, the 82576 notifies the MC only on the first fragment of a received packet. It is the MC's responsibility to read the full packet, including all the fragments.

Any time-out on the SMBus notification results in discarding the entire packet. Any NACK by the MC causes the fragment to be re-transmitted to the MC on the next receive packet command.

The maximum size of the received packet is limited by the 82576 hardware to 1536 bytes. Packets larger than 1536 bytes are silently discarded. Any packet smaller than 1536 bytes is processed.

10.5.8 Transmit TCO Flow

The 82576 is used as the channel for transmitting packets from the external MC to the network link.
The network packet is transferred from the MC over the SMBus and then, when fully received by the 82576, is transmitted over the network link.

In dual-address mode, each SMBus address is connected to a different LAN port. When a packet is received during an SMBus transaction using SMBus address #0, it is transmitted to the network using LAN port #0; it is transmitted through LAN port #1 if received on SMBus address #1. In single-address mode, the transmit port is chosen according to the fail-over algorithm.

The 82576 supports packets up to an Ethernet packet length of 1536 bytes. Since SMBus transactions can only be up to 240 bytes in length, packets might need to be transferred over the SMBus in more than one fragment. This is achieved using the F and L bits in the command number of the transmit TCO packet block write command. When the F bit is set, it is the first fragment of the packet. When the L bit is set, it is the last fragment of the packet. When both bits are set, the entire packet is in one fragment.

The packet is sent over the network link only after all its fragments are received correctly over the SMBus. The maximum SMBus fragment size is defined within the NVM and cannot be changed by the MC.

Bits:    8          | 1 | 8           | 1 | ... | 1 | 8           | 1 | 1
Field:   Byte count | A | Data byte 1 | A | ... | A | Data byte N | A | P
Value:   N          | 0 |             | 0 |     | 0 |             | 0 |
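The fragmentation rule above can be sketched as a generator that splits a packet and reports the F and L flags for each piece. This is a minimal sketch: the function name is hypothetical, `max_fragment` stands in for the NVM-defined maximum fragment size, and the encoding of the flags into the command byte is intentionally not modeled.

```python
from typing import Iterator, Tuple

MAX_PACKET_BYTES = 1536  # larger packets are silently discarded by the device


def fragment_packet(packet: bytes, max_fragment: int) -> Iterator[Tuple[bool, bool, bytes]]:
    """Yield (first_flag, last_flag, data) for each SMBus fragment of a packet."""
    if not packet:
        raise ValueError("empty packet")
    if len(packet) > MAX_PACKET_BYTES:
        raise ValueError("packet exceeds 1536 bytes")
    if max_fragment <= 0:
        raise ValueError("fragment size must be positive")
    chunks = [packet[i:i + max_fragment] for i in range(0, len(packet), max_fragment)]
    for index, chunk in enumerate(chunks):
        yield index == 0, index == len(chunks) - 1, chunk


if __name__ == "__main__":
    for first, last, data in fragment_packet(bytes(500), 240):
        print(first, last, len(data))
```

A packet that fits in one fragment yields a single tuple with both flags set, matching the single-fragment case in the text.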
If the packet sent by the MC is larger than 1536 bytes, the 82576 silently discards the packet.

The minimum packet length defined by the 802.3 specification is 64 bytes. The 82576 pads packets that are shorter than 64 bytes to meet the specification requirements (there is no need for the external MC to pad packets shorter than 64 bytes).

The 82576 calculates the L2 CRC on the transmitted packet and adds its four bytes at the end of the packet. Any other packet field (such as a checksum, XSUM) must be calculated and inserted by the MC (the 82576 does not change any field in the transmitted packet, other than adding padding and CRC bytes).

If the network link is down when the 82576 has received the last fragment of the packet from the MC, it silently discards the packet. Note that a link down event during the transfer of a packet over the SMBus does not stop the operation, since the 82576 waits for the last fragment to end to see whether the network link is up again.

10.5.8.1 Transmit Errors in Sequence Handling

When a packet is transferred over the SMBus from the MC to the 82576, the F and L flags should follow specific rules. The F flag marks the first fragment of the packet; the L flag indicates that the transaction contains the last fragment of the packet.

Table 10-7 lists the different flag options in transmit packet transactions.

10.5.8.2 TCO Command Aborted Flow

The 82576 indicates an error or an abort condition to the MC by setting the TCO abort bit in the general status. The 82576 might also be configured to send a notification to the MC (see section 10.5.10.1.3.3).

Following is a list of possible error and abort conditions:

• Any error in the SMBus protocol (NACK, SMBus time-outs, etc.).
• Any error in compatibility between required protocols and specific functionality (for example, an RX enable command with a byte count not equal to 1/14, as defined in the command specification).
• If the 82576 does not have space to store the transmitted packet from the MC (in its internal buffer space) before sending it to the link, the packet is discarded and the external MC is notified via the abort bit.
• An error in the F/L bit sequence during multi-fragment transactions.

Table 10-7. Flag Options During Transmit Packet Transactions

Previous   Current     Action/Notes
Last       First       Accept both.
Last       Not first   Error for the current transaction. The current transaction is discarded and an abort status is asserted.
Not last   First       Error in the previous transaction. The previous transaction (until the previous first) is discarded. The current packet is processed. No abort status is asserted.
Not last   Not first   Process the current transaction.

Note: Since every other block write command in the TCO protocol has both the F and L flags off, such commands flush any pending transmit fragments that were previously received. When running the TCO transmit flow, no other block write transactions are allowed in between the fragments.
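Table 10-7 can be expressed as a small decision function. This is a sketch for illustration only: the function name and the returned action strings are hypothetical, and the second element of the result simply mirrors whether the table says an abort status is asserted.

```python
from typing import Tuple


def check_fragment(previous_was_last: bool, current_is_first: bool) -> Tuple[str, bool]:
    """Return (action, abort_asserted) for a transmit fragment per Table 10-7."""
    if previous_was_last and current_is_first:
        return "accept", False
    if previous_was_last and not current_is_first:
        # A continuation arrived with no packet in progress.
        return "discard current", True
    if not previous_was_last and current_is_first:
        # A new packet started before the previous one finished.
        return "discard previous, process current", False
    return "process current", False


if __name__ == "__main__":
    print(check_fragment(previous_was_last=True, current_is_first=False))
```

Only the "last followed by not first" case raises the abort status; the other error case drops the incomplete packet quietly.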
• An internal reset to the 82576's firmware.

10.5.9 SMBus ARP Transactions

All SMBus ARP transactions include the PEC byte.

10.5.9.1 Prepare to ARP

This command clears the address resolved flag (set to false). It does not affect the status or validity of the dynamic SMBus address and is used to inform all devices that the ARP master is starting the ARP process:

Bits:    1 | 7             | 1  | 1 | 8         | 1 | 8                      | 1 | 1
Field:   S | Slave address | Wr | A | Command   | A | PEC                    | A | P
Value:     | 1100 001      | 0  | 0 | 0000 0001 | 0 | [data dependent value] | 0 |

10.5.9.2 Reset Device (General)

This command clears the address resolved flag (set to false). It does not affect the status or validity of the dynamic SMBus address.

Bits:    1 | 7             | 1  | 1 | 8         | 1 | 8                      | 1 | 1
Field:   S | Slave address | Wr | A | Command   | A | PEC                    | A | P
Value:     | 1100 001      | 0  | 0 | 0000 0010 | 0 | [data dependent value] | 0 |

10.5.9.3 Reset Device (Directed)

The command field is NACKed if bits 7:1 do not match the current SMBus address. This command clears the address resolved flag (set to false) and does not affect the status or validity of the dynamic SMBus address.

Bits:    1 | 7             | 1  | 1 | 8                          | 1 | 8                      | 1 | 1
Field:   S | Slave address | Wr | A | Command                    | A | PEC                    | A | P
Value:     | 1100 001      | 0  | 0 | Targeted slave address | 0 | 0 | [data dependent value] | 0 |

10.5.9.4 Assign Address

This command assigns the SMBus address. The address and command bytes are always acknowledged.

The transaction is aborted (NACKed) immediately if any of the UDID bytes is different from the 82576 UDID bytes. If successful, the manageability system internally updates the SMBus address. This command also sets the address resolved flag (set to true).
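Since all SMBus ARP transactions carry a PEC byte, a short sketch of its calculation may help. The PEC is defined by the SMBus specification (not by this section) as a CRC-8 with polynomial x^8 + x^2 + x + 1 over all bytes of the transaction, including the address bytes with their read/write bit. The function name is hypothetical.

```python
def smbus_pec(data: bytes) -> int:
    """Compute the SMBus packet error code (CRC-8, polynomial 0x07, initial value 0)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


if __name__ == "__main__":
    # Prepare to ARP: address byte 1100 001 + Wr (0) = 0xC2, command 0x01.
    print(hex(smbus_pec(bytes([0xC2, 0x01]))))
```

A receiver can validate a transaction by running the same function over the received bytes including the PEC; a result of zero indicates a match.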
assign address transaction format (section 10.5.9.4). each data byte and the pec is followed by a 1-bit acknowledge (a = 0):

field              bits    value
s                  1
slave address      7       1100 001
wr                 1       0
a                  1       0
command            8       0000 0100
a                  1       0
byte count         8       0001 0001
a                  1       0
data 1             8       udid byte 15 (msb)
data 2 ... 15      8 each  udid byte 14 ... udid byte 1
data 16            8       udid byte 0 (lsb)
data 17            8       assigned address
pec                8       [data dependent value]
p                  1

10.5.9.5 get udid (general and directed)

the general get udid smbus transaction supports a constant command value of 0x03 and, if directed, supports a dynamic command value equal to the dynamic smbus address. if the smbus address has been resolved (address resolved flag set to true), the manageability system does not acknowledge (nack) this transaction. if it's a general command, the manageability system always acknowledges (acks) as a directed transaction.

this command does not affect the status or validity of the dynamic smbus address or the address resolved flag.

get udid transaction format. each data byte is followed by a 1-bit acknowledge (a = 0); the pec is followed by a not-acknowledge (~a = 1):

field              bits    value
s                  1
slave address      7       1100 001
wr                 1       0
a                  1       0
command            8       see below
a                  1       0
s                  1       (repeated start)
slave address      7       1100 001
rd                 1       1
a                  1       0
byte count         8       0001 0001
a                  1       0
data 1             8       udid byte 15 (msb)
data 2 ... 15      8 each  udid byte 14 ... udid byte 1
data 16            8       udid byte 0 (lsb)
data 17            8       device slave address
pec                8       [data dependent value]
~a                 1       1
p                  1

the get udid command value depends on whether this is a directed or a general command. the general get udid smbus transaction supports a constant command value of 0x03. the directed get udid smbus transaction supports a dynamic command value equal to the dynamic smbus address with the lsb set.

note: bit 0 (lsb) of data byte 17 is always 1b.
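The following sketch illustrates two details of the arp transactions above: the pec byte and the directed get udid command value. It assumes the standard smbus packet error code (crc-8 with polynomial x^8 + x^2 + x + 1, initial value 0) and assumes the 7-bit smbus address occupies bits 7:1 of the command and of data byte 17, as the "bits 7:1" wording in section 10.5.9.3 suggests. All function names are hypothetical.

```python
def smbus_pec(data: bytes) -> int:
    """CRC-8 (polynomial 0x07, init 0) over the bytes of an SMBus transaction."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ 0x07) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def directed_get_udid_command(addr7: int) -> int:
    """Command byte for a directed get udid: address in bits 7:1, lsb set."""
    if not 0 <= addr7 <= 0x7F:
        raise ValueError("expected a 7-bit smbus address")
    return (addr7 << 1) | 1


def parse_get_udid_data(data: bytes) -> tuple[int, int]:
    """Split the 17 data bytes into (udid as integer, 7-bit device address)."""
    if len(data) != 17:
        raise ValueError("get udid returns 17 data bytes (byte count 0001 0001)")
    udid = int.from_bytes(data[:16], "big")  # data 1 is udid byte 15 (msb)
    if not data[16] & 1:
        raise ValueError("bit 0 of data byte 17 is expected to be 1b")
    return udid, data[16] >> 1


if __name__ == "__main__":
    # prepare to arp: address byte 1100 001 + wr (0xC2), then command 0x01.
    print(hex(smbus_pec(bytes([0xC2, 0x01]))))
    print(hex(directed_get_udid_command(0x49)))
```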
10.5.10 smbus pass-through transactions

this section details the commands (both read and write) that the 82576 smbus interface supports for pass-through.

10.5.10.1 write smbus transactions

this section details the commands that the mc can send to the 82576 over the smbus interface. table 10-8 lists the supported transactions.

table 10-8. write smb transactions

tco command                       transaction  command                                fragmentation  section
transmit packet                   block write  first: 0x84, middle: 0x04, last: 0x44  multiple       10.5.10.1.1
transmit packet                   block write  single: 0xc4                           single         10.5.10.1.1
request status                    block write  single: 0xdd                           single         10.5.10.1.2
receive enable                    block write  single: 0xca                           single         10.5.10.1.3
force tco                         block write  single: 0xcf                           single         10.5.10.1.4
management control                block write  single: 0xc1                           single         10.5.10.1.5
update mng rcv filter parameters  block write  single: 0xcc                           single         10.5.10.1.5.1
update macsec parameters          block write  single: 0xc9                           single         10.5.10.1.6

10.5.10.1.1 transmit packet command

the transmit packet command behavior is detailed in chapter 3.0. the transmit packet fragments have the following format:

function                  command  byte count  data 1           ...  data n
transmit first fragment   0x84     n           packet data msb  ...  packet data lsb
transmit middle fragment  0x04     n           packet data msb  ...  packet data lsb
transmit last fragment    0x44     n           packet data msb  ...  packet data lsb
transmit single fragment  0xc4     n           packet data msb  ...  packet data lsb

the payload length is limited to the maximum payload length set in the eeprom. if the overall packet length is greater than 1536 bytes, the packet is silently discarded.

10.5.10.1.2 request status command
an external mc can initiate a request to read the 82576 manageability status by sending a request status command. when received, the 82576 initiates a notification to the external mc when the status is ready. after this, the external controller can read the status by issuing a read status command (see section 10.5.10.2.2). the format is as follows:

function        command  byte count  data 1
request status  0xdd     1           0

10.5.10.1.3 receive enable command

the receive enable command is a single fragment command used to configure the 82576. this command has two formats: a short, 1-byte legacy format (providing backward compatibility with previous components) and a long, 14-byte advanced format (allowing greater configuration capabilities). the receive enable command format is as follows:

function                 cmd   byte count  data 1                data 2 ... data 7              data 8 ... data 11           data 12       data 13        data 14
legacy receive enable    0xca  1           receive control byte  -                              -                            -             -              -
advanced receive enable  0xca  14 (0x0e)   receive control byte  mac addr msb ... mac addr lsb  ip addr msb ... ip addr lsb  mc smbus addr  i/f data byte  alert value byte

the receive control byte is defined as follows:

field       bit(s)  description
rcv_en      0       receive tco enable. 0b: disable receive tco packets. 1b: enable receive tco packets. setting this bit enables all manageability receive filtering operations. enabling specific filters is done via the nvm or through special configuration commands. note: when the rcv_en bit is cleared, all receive tco functionality is disabled, not just the packets that are directed to the mc (also auto arp packets).
rcv_all     1       receive all enable. 0b: disable receiving all packets. 1b: enable receiving all packets. forwards all packets received over the wire that passed l2 filtering to the external mc. this flag has no effect if bit 0 (enable tco packets) is disabled.
en_sta      2       enable status reporting. 0b: disable status reporting. 1b: enable status reporting.
en_arp_res  3       enable arp response. 0b: disable the 82576 arp response. the 82576 treats arp packets as any other packet; for example, a packet is forwarded to the mc if it passed other (non-arp) filtering. 1b: enable the 82576 arp response. the 82576 automatically responds to all received arp requests that match its ip address. note that setting this bit does not change the rx filtering settings. appropriate rx filtering to enable arp request packets to reach the mc should be set by the mc or by the eeprom. the mc ip address is provided as part of the receive enable message (bytes 8:11). if a short version of the command is used, the 82576 uses the ip address configured in the most recent long version of the command in which the en_arp_res bit was set. if no such previous long command exists, the 82576 uses the ip address configured in the eeprom as the arp response ipv4 address in the pass-through lan configuration structure. if the cbdm bit is set, the 82576 uses the mc dedicated mac address in arp response packets. if the cbdm bit is not set, the mc uses the host mac address.
nm          5:4     notification method. defines the notification method the 82576 uses. 00b: smbus alert. 01b: asynchronous notify. 10b: direct receive. 11b: not supported.
reserved    6       reserved. must be set to 1b.
cbdm        7       configure the mc dedicated mac address. note: this bit should be 0b when the rcv_en bit (bit 0) is not set. 0b: the 82576 shares the mac address for mng traffic with the host mac address, which is specified in nvm words 0x0-0x2. 1b: the 82576 uses the mc dedicated mac address as a filter for incoming receive packets. the mc mac address is set in bytes 2-7 of this command. if a short version of the command is used, the 82576 uses the mac address configured in the most recent long version of the command in which the cbdm bit was set. when the dedicated mac address feature is activated, the 82576 uses the following registers to filter in all the traffic addressed to the mc mac.

10.5.10.1.3.1 management mac address (data bytes 7:2)

ignored if the cbdm bit is not set. this mac address is used to configure the dedicated mac address. in addition, it is used in the arp response packet when the en_arp_res bit is set. this mac address is also used when the cbdm bit is set in subsequent short versions of this command.

10.5.10.1.3.2 management ip address (data bytes 11:8)

this ip address is used to filter arp request packets.

10.5.10.1.3.3 asynchronous notification smbus address (data byte 12)

this address is used for the asynchronous notification smbus transaction and for direct receive.

10.5.10.1.3.4 interface data (data byte 13)

interface data byte used in asynchronous notification.
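The receive enable layout described in section 10.5.10.1.3 can be sketched as follows. This is an illustrative helper only, not an Intel-provided api: the function names are hypothetical, and the returned bytes are assumed to be the command, the byte count, and the data bytes in the order shown in the format table (the smbus slave address and pec are left to the bus driver).

```python
RECEIVE_ENABLE_CMD = 0xCA  # single-fragment receive enable command (table 10-8)


def receive_control_byte(rcv_en: bool = True, rcv_all: bool = False,
                         en_sta: bool = False, en_arp_res: bool = False,
                         nm: int = 0, cbdm: bool = False) -> int:
    """Pack the receive control byte (data 1) from its bit fields."""
    if nm not in (0b00, 0b01, 0b10):
        raise ValueError("nm = 11b is not supported")
    if cbdm and not rcv_en:
        raise ValueError("cbdm should be 0b when rcv_en is not set")
    value = 0x40  # bit 6 is reserved and must be set to 1b
    value |= int(rcv_en)
    value |= int(rcv_all) << 1
    value |= int(en_sta) << 2
    value |= int(en_arp_res) << 3
    value |= nm << 4
    value |= int(cbdm) << 7
    return value


def legacy_receive_enable(control: int) -> bytes:
    """Short format: command, byte count 1, receive control byte."""
    return bytes([RECEIVE_ENABLE_CMD, 1, control])


def advanced_receive_enable(control: int, mac: bytes, ip: bytes,
                            mc_smbus_addr: int, if_data: int,
                            alert_value: int) -> bytes:
    """Long format: command, byte count 14, then data 1 through data 14."""
    if len(mac) != 6 or len(ip) != 4:
        raise ValueError("expected a 6-byte mac address and a 4-byte ip address")
    data = bytes([control]) + mac + ip + bytes([mc_smbus_addr, if_data, alert_value])
    return bytes([RECEIVE_ENABLE_CMD, len(data)]) + data


if __name__ == "__main__":
    ctrl = receive_control_byte(en_arp_res=True, nm=0b01, cbdm=True)
    frame = advanced_receive_enable(ctrl, bytes.fromhex("001122334455"),
                                    bytes([192, 168, 0, 10]), 0x20, 0x00, 0x00)
    print(frame.hex())
```

The mac and ip addresses are passed most significant byte first, matching the "msb ... lsb" ordering of data 2 through data 11.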
10.5.10.1.3.5 alert value data (data byte 14)

alert value data byte used in asynchronous notification.

10.5.10.1.4 force tco command

this command causes the 82576 to perform a tco reset, if force tco reset is enabled in the nvm. the force tco reset clears the data path (rx/tx) of the 82576 to enable the mc to transmit/receive packets through the 82576. note that in single-address mode, both ports are reset when the command is issued. in dual-address mode, force tco reset is asserted only to the port related to the smbus address the command was issued to. this command should only be used when the mc is unable to transmit or receive and suspects that the 82576 is inoperable. the command also causes the lan device driver to unload. it is recommended to perform a system restart to resume normal operation.

the 82576 considers the force tco command as an indication that the operating system is hung and clears the drv_load flag.

the force tco reset command format is as follows:

function         command  byte count  data 1
force tco reset  0xcf     1           tco mode

where tco mode is:

field       bit(s)  description
do_tco_rst  0       perform tco reset. 0b: do nothing. 1b: perform tco reset.
reserved    1       reserved (set to 0).
reset_mgmt  2       reset manageability; re-load manageability eeprom words. 0b: do nothing. 1b: issue firmware reset to manageability. setting this bit generates a one-time firmware reset. following the reset, management data from the eeprom is loaded.
reserved    7:3     reserved (set to 0x00).

note: before initiating a firmware reset command, the mc should disable tco receive via the receive enable command (setting rcv_en to 0) and wait 200 milliseconds before initiating the firmware reset command. in addition, the mc should not transmit during this period.

10.5.10.1.5 management control

this command is used to set generic manageability parameters. the parameter list is shown in table 10-9. the command is 0xc1, stating that it is a management control command. the first data byte is the parameter number, and the data that follows (length and content) is parameter specific, as shown in table 10-9.

note: if the parameter that the mc sets is not supported by the 82576, the 82576 does not nack the transaction. after the transaction ends, the 82576 discards the data and asserts a transaction abort status.
the management control command format is as follows:

function            command  byte count  data 1            data 2 ... data n
management control  0xc1     n           parameter number  parameter dependent

table 10-9. management control command parameters/content

parameter         number  parameter data
keep phy link up  0x00    a single byte parameter (data 2).
                          bit 0: set to indicate that the phy link for this port should be kept up throughout system resets. this is useful when the server is reset and the mc needs to keep connectivity for a manageability session. 0b: disabled. 1b: enabled.
                          bits 7:1: reserved.

10.5.10.1.5.1 update management receive filter parameters

this command is used to set the manageability receive filter parameters. the command is 0xcc. the first data byte is the parameter number, and the data that follows (length and content) is parameter specific, as listed in table 10-10. if the parameter that the mc sets is not supported by the 82576, the 82576 does not nack the transaction. after the transaction ends, the 82576 discards the data and asserts a transaction abort status.

the update management receive filter parameters command format is as follows:

function                                command  byte count  data 1            data 2 ... data n
update manageability filter parameters  0xcc     n           parameter number  parameter dependent

table 10-10 lists the different parameters and their content.

table 10-10. management receive filter parameters

parameter                         number  parameter data
filters enables                   0x1     defines the generic filters configuration. the structure of this parameter is four bytes, as in the manageability control (manc) register. note: the general filter enable is in the receive enable command that enables receive filtering.
management-to-host configuration  0xa     this parameter defines which of the packet types identified as manageability packets in the receive path are directed to the host memory. data 5:2 = manc2h register bits. data 2 is the msb.
table 10-10. management receive filter parameters (continued)

parameter                               number  parameter data
fail-over configuration                 0xb     fail-over register configuration. the bytes of this parameter are loaded into the fail-over configuration register. see section 10.5.11.3 for more information. data 2:5 = fail-over configuration register. data 2 is the msb.
flex filter 0 enable mask and length    0x10    flex filter 0 mask. data 17:2 = mask. bit 0 in data 2 is the first bit of the mask. data 19:18 = reserved. should be set to 00b. data 20 = flexible filter length.
flex filter 0 data                      0x11    data 2: group of the flex filter's bytes: 0x0 = bytes 0-29, 0x1 = bytes 30-59, 0x2 = bytes 60-89, 0x3 = bytes 90-119, 0x4 = bytes 120-127. data 3:32 = flex filter data bytes. data 3 is the lsb. a group's length is not a mandatory 30 bytes; it might vary according to the filter's length and must not be padded with zeros.
flex filter 1 enable mask and length    0x20    same as parameter 0x10 but for filter 1.
flex filter 1 data                      0x21    same as parameter 0x11 but for filter 1.
flex filter 2 enable mask and length    0x30    same as parameter 0x10 but for filter 2.
flex filter 2 data                      0x31    same as parameter 0x11 but for filter 2.
flex filter 3 enable mask and length    0x40    same as parameter 0x10 but for filter 3.
flex filter 3 data                      0x41    same as parameter 0x11 but for filter 3.
filters valid                           0x60    four bytes to determine which of the 82576 filter registers contain valid data. loaded into the mfval register. should be updated after the contents of a filter register are updated. data 2: msb of mfval. ... data 5: lsb of mfval.
decision filters                        0x61    five bytes are required to load the manageability decision filters (mdef). data 2: decision filter number. data 3: msb of the mdef register for this decision filter. ... data 6: lsb of the mdef register for this decision filter.
vlan filters                            0x62    three bytes are required to load the vlan tag filters. data 2: vlan filter number. data 3: msb of vlan filter. data 4: lsb of vlan filter.
flex port filters                       0x63    three bytes are required to load the manageability flex port filters. data 2: flex port filter number. data 3: msb of flex port filter. data 4: lsb of flex port filter.
ipv4 filters                            0x64    five bytes are required to load the ipv4 address filter. data 2: ipv4 address filter number (3:0). data 3: lsb of ipv4 address filter. ... data 6: msb of ipv4 address filter.
ipv6 filters                            0x65    17 bytes are required to load the ipv6 address filter. data 2: ipv6 address filter number (3:0). data 3: lsb of ipv6 address filter. ... data 18: msb of ipv6 address filter.
mac filters                             0x66    seven bytes are required to load the mac address filters. data 2: mac address filters pair number (3:0). data 3: msb of mac address. ... data 8: lsb of mac address.
ethertype filters                       0x67    five bytes to load the ethertype filters (metf). data 2: metf filter index (valid values are 0 and 1; indexes 2 and 3 are valid if macsec is not in use). data 3: msb of metf. ... data 6: lsb of metf.
extended decision filter                0x68    nine bytes to load the extended decision filters (mdef_ext and mdef). data 2: mdef filter index (valid values are 0..6). data 3: msb of mdef_ext (decisionfilter1). ... data 6: lsb of mdef_ext (decisionfilter1). data 7: msb of mdef (decisionfilter0). ... data 10: lsb of mdef (decisionfilter0). the command overwrites any previously stored value.

10.5.10.1.6 update macsec parameters

this command is used to set the manageability macsec parameters. the parameter list is shown in the table below. the first data byte is the parameter number, and the data that follows (length and content) is parameter specific, as shown in the table.

this is the format of the update macsec parameters command:
function                         command  byte count  data 1            data 2 ... data n
update macsec filter parameters  0xc9     n           parameter number  parameter dependent

the table below shows the different parameters and their contents.

parameter                             number  parameter data
transfer macsec ownership to bmc      0x10    data 2: host control. bit 0: reserved. bit 1: allow host traffic (0b: blocked, 1b: allowed). bits 2...31: reserved.
transfer macsec ownership to host     0x11    no data needed.
initialize macsec rx                  0x12    data 2: rx port identifier (msb). data 3: rx port identifier (lsb). rx port identifier: the port number by which the 82576 identifies rx packets. it is recommended that the mc uses 0x0 as the port identifier. note: the mc should use the same port identifier when performing the key exchange. data 4: rx sci (msb). ... data 9: rx sci (lsb). rx sci: a 6-byte unique identifier for the macsec tx ca. it is recommended that the mc uses its mac address value for this field.
initialize macsec tx                  0x13    data 2: tx port identifier (msb). data 3: tx port identifier (lsb). the tx port identifier must be set to zero. data 4: tx sci (msb). ... data 9: tx sci (lsb). tx sci: a 6-byte unique identifier for the macsec tx ca. it is recommended that the bmc uses its mac address value for this field. data 10: reserved. data 11: reserved. data 12: packet number threshold (msb). ... data 15: packet number threshold (lsb). pn threshold: when a new key is programmed, the packet number is reset to 0x1. with each tx packet, the packet number increments by 1 and is inserted into the packet (to avoid replay attacks). the pn threshold value is the 3 msbytes of the tx packet number after which a "key exchange required" aen is sent to the bmc. example: a pn threshold of 0x123456 means that a notification is sent when the pn reaches 0x123456ff. if the pn threshold is less than 0x100, the pn threshold is set to a default of 0x4000. data 16: tx control; see table 10-11.
set macsec rx key                     0x14    data 2: reserved. data 3: rx sa an. data 4: rx macsec key (msb). ... data 19: rx macsec key (lsb). rx sa an: the association number to be used with this key. rx macsec key: the 128-bit (16-byte) key to be used for rx.
set macsec tx key                     0x15    data 2: reserved. data 3: tx sa an. data 4: tx macsec key (msb). ... data 19: tx macsec key (lsb). tx sa an: the association number to be used with this key. tx macsec key: the 128-bit (16-byte) key to be used for tx.
enable macsec network tx encryption   0x16    data 2: mode. 0: authentication only. 1: encryption and authentication.
disable macsec network tx encryption  0x17    no data needed.
enable macsec network rx decryption   0x18    no data needed.
disable macsec network rx decryption  0x19    no data needed.

table 10-11. tx control

bit   description
0..4  reserved
5     always include sci in tx. 0b: do not include sci in tx packets. 1b: include sci in tx packets.
6..7  reserved

10.5.10.2 read smbus transactions

this section details the pass-through read transactions that the mc can send to the 82576 over smbus. if an smbus quick read command is received, it is handled as a request status command (see section 10.5.10.1.2, request status command).
table 10-12 lists the smbus read transactions supported by the 82576. all the read transactions are compatible with the smbus read block protocol format. the 0xc0 or 0xd0 commands are used for more than one payload. if the mc issues these read commands and the 82576 has no pending data to transfer, it always returns the default opcode 0xdd with the 82576 status and does not nack the transaction.

table 10-12. smbus read transactions

tco command                            transaction  command               opcode                                      fragments  section
receive tco packet                     block read   0xd0 or 0xc0          first: 0x90, middle: 0x10, last (1): 0x50   multiple   10.5.10.2.1
read status                            block read   0xd0 or 0xc0 or 0xde  single: 0xdd                                single     10.5.10.2.2
get system mac address                 block read   0xd4                  single: 0xd4                                single     10.5.10.2.3
read management parameters             block read   0xd1                  single: 0xd1                                single     10.5.10.2.4
read management rcv filter parameters  block read   0xcd                  single: 0xcd                                single     10.5.10.2.5
read receive enable configuration      block read   0xda                  single: 0xda                                single     10.5.10.2.6
read macsec parameters                 block read   0xd9                  single: 0xd9                                single     10.5.10.2.7

1. the last fragment of the receive tco packet is the packet status.

10.5.10.2.1 receive tco lan packet transaction

the mc uses this command to read packets received on the lan and their status. when the 82576 has a packet to deliver to the mc, it asserts the smbus notification for the mc to read the data (or uses direct receive). upon receiving notification of the arrival of a lan receive packet, the mc begins issuing a receive tco packet command using the block read protocol.

a packet is transmitted to the mc in at least two fragments (at least one for the packet data and one for the packet status). as a result, the mc should follow the f and l bits of the op-code. the op-code can have these values:
- 0x90: first fragment.
- 0x10: middle fragment.
- 0x50: last fragment of the packet, which contains the packet status.

if a notification timeout is defined (in the nvm) and the mc does not finish reading the whole packet within the timeout period, measured from the time the packet arrived, the packet is silently discarded.

following is the receive tco packet format and the data format returned from the 82576.

function            command
receive tco packet  0xc0 or 0xd0
function                     byte count  data 1 (op-code)  data 2 ... data n
receive tco first fragment   n           0x90              packet data byte ... packet data byte
receive tco middle fragment  n           0x10              packet data byte ... packet data byte
receive tco last fragment    17 (0x11)   0x50              see section 10.5.10.2.1.1

10.5.10.2.1.1 receive tco lan status payload transaction

this transaction is the last transaction that the 82576 issues when a packet received from the lan is transferred to the mc. the transaction contains the status of the received packet. the format of the status transaction is as follows:

function                 byte count  data 1 (op-code)  data 2 ... data 17 (status data)
receive tco long status  17 (0x11)   0x50              see below

the status is 16 bytes, where byte 0 (bits 7:0) is set in data 2 of the status and byte 15 in data 17 of the status. table 10-13 lists the content of the status data.

table 10-13. tco lan packet status data

field         bit(s)  description
lan#          21      indicates the source port of the packet.
reserved      20      reserved.
vp            19      vlan stripped - insertion of vlan tag is needed.
vext          18      additional vlan present in packet.
reserved      17:15   reserved.
secp          14      security offload done on packet (valid only in macsec mode).
reserved      14      reserved.
reserved      13:12   reserved.
crc stripped  11      insertion of crc is needed.
reserved      10:6    reserved.
udpv          5       udp checksum valid.
reserved      4:3     reserved.
ipcs          2       ipv4 checksum calculated on packet.
l4i           1       l4 (tcp/udp) checksum calculated on packet.
udpcs         0       udp checksum calculated on packet.

table 10-14. error status info

field  bit(s)  description
rxe    4       rx data error.
ipe    3       ipv4 checksum error.
l4e    2       l4 (tcp/udp) checksum error.
sece   1:0     security error; see table 10-15.

table 10-15. security errors

code  error type
00b   no error
01b   no sa match
10b   replay error
11b   bad signature

table 10-16. packet type

bit index  bit 11 = 0b                                   bit 11 = 1b (l2 packet)
12         vlan packet indication                        vlan packet indication
11         packet matched one of the etqf filters        packet matched one of the etqf filters
10         macsec - macsec encapsulation (1)             macsec - macsec encapsulation
9          ipsec ah - ipsec encapsulation                reserved
8          ipsec esp - ipsec encapsulation
7          nfs - nfs header present
6          sctp - sctp header present
5          udp - udp header present
4          tcp - tcp header present
3          ipv6e - ipv6 header includes extensions
2          ipv6 - ipv6 header present                    ethertype - etqf register index that matches the packet. special types might be defined for 1588, 802.1x, or any other requested type.
1          ipv4e - ipv4 header includes extensions
0          ipv4 - ipv4 header present

1. the macsec bit is set only if the packet forwarded to the host contains a macsec header and trailer. if the macsec encapsulation was processed and removed by hardware, this bit is not set.
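As a small illustration of tables 10-14 and 10-15, the error status field can be decoded as below. This is a sketch under assumptions: the field is assumed to have already been extracted from the 16 status bytes as an integer whose bit 0 is the least significant bit (the position of the field within the status data is not given in this excerpt), and `decode_error_status` is a hypothetical name.

```python
SECURITY_ERRORS = {
    0b00: "no error",
    0b01: "no sa match",
    0b10: "replay error",
    0b11: "bad signature",
}


def decode_error_status(value: int) -> dict:
    """Decode the 5-bit error status info field (table 10-14)."""
    if not 0 <= value <= 0x1F:
        raise ValueError("error status info is a 5-bit field")
    return {
        "rxe": bool(value & 0x10),             # bit 4: rx data error
        "ipe": bool(value & 0x08),             # bit 3: ipv4 checksum error
        "l4e": bool(value & 0x04),             # bit 2: l4 (tcp/udp) checksum error
        "sece": SECURITY_ERRORS[value & 0x03], # bits 1:0: security error code
    }


if __name__ == "__main__":
    print(decode_error_status(0b01010))
```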
table 10-17. mng status

name                                   bits   description
decision filter match                  42:35  set when there is a match to one of the decision filters.
ipv4/ipv6 match                        34     set when there is an ipv4 match or ipv6 match. this bit is valid only if bit 30 (ip match bit) or bit 4 (arp match bit) is set.
ip address match                       33     set when there is a match to any of the ip address filters.
ip address index                       32:31  set when there is a match to the ip filter number (ipv4 or ipv6).
flex tco filter match                  30     set when the mng packet matches one of the mng flex tco filters.
flex tco filter index                  29:27
l4 port match                          26     set when there is a match to any of the udp/tcp port filters.
l4 port filter index                   25:19  indicates the flex filter number.
unicast address match                  18     set when there is a match to any of the 4 unicast mac addresses.
unicast address index                  17:15  indicates which of the 4 unicast mac addresses matches the packet. valid only if the unicast address match bit is set.
mng vlan address match                 14     set when the mng packet matches one of the mng vlan filters.
pass mng vlan filter index             13:11
reserved                               10:8   reserved.
pass arp req / arp resp                7      set when the mng packet is an arp response/request packet.
pass mng neighbor                      6      set when the mng packet is a neighbor discovery packet.
pass mng broadcast                     5      set when the mng packet is a broadcast packet.
pass rmcp 0x0298                       4      set when the udp/tcp port of the mng packet is 0x298.
pass rmcp 0x026f                       3      set when the udp/tcp port of the mng packet is 0x26f.
manageability ethertype filter passed  2      indicates that one of the metf filters matched.
manageability ethertype filter index   1:0    indicates which of the metf filters matched.

10.5.10.2.2 read status command

the mc should use this command after receiving a notification from the 82576 (such as smbus alert). the 82576 also sends a notification to the mc in either of the following two cases:
- the mc asserts a request for reading the status.
- the 82576 detects a change in one of the status data 1 bits (and was set, in the receive enable command, to send status to the mc on status change).

note: commands 0xc0/0xd0 are for backward compatibility and can be used for other payloads. for these commands, the 82576 indicates in the returned opcode which payload the transaction carries. when the 0xde command is used, the 82576 always returns opcode 0xdd with the 82576 status.

the 82576 responds to one of the commands (0xc0 or 0xd0) within the time defined in the smbus notification timeout and flags word in the nvm. the mc reads the event causing the notification using the read status command as follows.
function     command
read status  0xc0 or 0xd0 or 0xde

function                    byte count  data 1 (op-code)  data 2 (status data 1)  data 3 (status data 2)
receive tco partial status  3           0xdd              see below               see below

table 10-18 lists the status data byte 1 parameters.

status data byte 2 is used by the mc to indicate whether the lan device driver is alive and running. the lan device driver valid indication is a bit set by the lan device driver during initialization; the bit is cleared when the lan device driver enters a dx state or is cleared by the hardware on a pci reset. bits 2 and 1 indicate whether the lan device driver is stuck. bit 2 indicates whether the interrupt line of the lan function is asserted. bit 1 indicates whether the lan device driver dealt with the interrupt line before the last read status cycle. table 10-19 lists status data byte 2.

table 10-18. status data byte 1

bit  name                       description
7    reserved                   reserved.
6    tco command aborted        1b = a tco command abort event occurred since the last read status cycle. 0b = a tco command abort event did not occur since the last read status cycle.
5    link status indication     0b = lan link down. 1b = lan link up (1).
4    phy link forced up         contains the value of the phy_link_up bit. when set, indicates that the phy link is configured to keep the link up.
3    initialization indication  0b = an nvm reload event has not occurred since the last read status cycle. 1b = an nvm reload event has occurred since the last read status cycle (2).
2    reserved                   reserved.
1:0  power state                00b = dr state. 01b = d0u state. 10b = d0 state. 11b = d3 state (3).

1. when the 82576 is operating in teaming mode, and presented as one smbus device, the link indication is 0b only when both links (on both ports) are down. if one of the lans is disabled, its link is considered to be down.
2. this indication is asserted when the 82576 manageability block reloads the nvm and its internal database is updated to the nvm default values. this is an indication that the external mc should reconfigure the 82576 if values other than the nvm defaults should be configured.
3. in single-address mode, the 82576 reports the highest power-state mode of both devices. the "d" state is ranked in this order: d0, d0u, dr, and d3.
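The bit layout of status data byte 1 (table 10-18) can be decoded with a few masks. The sketch below is illustrative only; `decode_status_byte1` and the dictionary keys are hypothetical names, and the input is assumed to be data 2 of the read status response.

```python
POWER_STATES = {0b00: "dr", 0b01: "d0u", 0b10: "d0", 0b11: "d3"}


def decode_status_byte1(value: int) -> dict:
    """Decode status data byte 1 of the read status response (table 10-18)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("expected a single byte")
    return {
        "tco_command_aborted": bool(value & 0x40),  # bit 6
        "link_up": bool(value & 0x20),              # bit 5
        "phy_link_forced_up": bool(value & 0x10),   # bit 4
        "nvm_reloaded": bool(value & 0x08),         # bit 3: initialization indication
        "power_state": POWER_STATES[value & 0x03],  # bits 1:0
    }


if __name__ == "__main__":
    print(decode_status_byte1(0x2A))
```

When `nvm_reloaded` is reported, footnote 2 of the table implies that the mc should reapply any configuration that differs from the nvm defaults.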
system manageability ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 775 table 10-20 lists the possible values of bits 2 and 1 and what the mc can assume from the bits: mc reads should consider the time it takes for the lan device driver to deal with the interrupt (in ? s). note that excessive reads by the mc can give false indications. 10.5.10.2.3 get system mac address the get system mac address returns the system mac address over to the smbus. this command is a single-fragment read block transaction that returns the following data: table 10-19. status data byte 2 bit name description 7 reserved reserved 6 reserved reserved 5 reserved reserved 4 macsec indication if set - indicates that a macsec event has occurred since the last read of the macsec interrupt cause. use the ?read macsec parameters? command with ?macsec interrupt cause? parameter to read the interrupt cause 3 driver valid indication 0b = lan driver is not alive. 1b = lan driver is alive. 2 interrupt pending indication 1b = lan interrupt line is asserted. 0b = lan interrupt line is not asserted. 1 interrupt cause register (icr0 read/write 1b = icr register was read since the last read status cycle. 0b = icr register was not read since the last read status cycle. reading the icr indicates that the driver has dealt with the interrupt that was asserted. 0 reserved reserved note: when the 82576 is in teaming mode, the bits listed in status data byte 2 represent both ports: 1. the lan device driver alive indication is set if one of the lan device drivers is alive. 2. the lan interrupt is considered asserted if one of the interrupt lines is asserted. 3. the icr is considered read if one of the icrs was read (lan 0 or lan 1). table 10-20. status data byte 2 (bits 2 and 1) previous current description don?t care 00b interrupt is not pending (ok). 00b 01b new interrupt is asserted (ok). 10b 01b new interrupt is asserted (ok). 
11b 01b interrupt is waiting for reading (ok).
01b 01b interrupt is waiting for reading by the driver for more than one read cycle (not ok). possible driver hang state.
don't care 11b previous interrupt was read and current interrupt is pending (ok).
don't care 10b interrupt is not pending (ok).
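the decision logic of table 10-20 can be sketched in python as follows. this is a minimal illustration, not part of the 82576 interface: the function name `classify_interrupt_state` is hypothetical, and the two-bit codes are taken exactly as printed in the table, so extracting them from status data byte 2 is left to the caller.

```python
def classify_interrupt_state(previous, current):
    """Classify two consecutive status reads according to table 10-20.

    `previous` and `current` are the 2-bit codes as printed in the table
    (0b00, 0b01, 0b10 or 0b11). Returns a tuple (ok, description).
    """
    for code in (previous, current):
        if code not in (0b00, 0b01, 0b10, 0b11):
            raise ValueError("expected a 2-bit code")

    # Rows where the previous value is "don't care".
    if current in (0b00, 0b10):
        return True, "interrupt is not pending"
    if current == 0b11:
        return True, "previous interrupt was read and current interrupt is pending"

    # current == 0b01: the meaning depends on the previous read.
    if previous == 0b01:
        return False, "interrupt waiting for the driver for more than one read cycle"
    if previous == 0b11:
        return True, "interrupt is waiting for reading"
    return True, "new interrupt is asserted"
```

only the 01b followed by 01b sequence is reported as not ok, which matches the single "possible driver hang" row of the table.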
intel ? 82576eb gbe controller ? system manageability intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 776 this command returns the mac address configured in nvm offset 0. get system mac address format: data returned from the 82576: 10.5.10.2.4 read management parameters in order to read the management parameters the mc should execute two smbus transactions. the first transaction is a block write that sets the parameter that the mc wants to read. the second transaction is block read that reads the parameter. block write transaction: following the block write the mc should issue a block read that reads the parameter that was set in the block write command: data returned: the returned data is in the same format as the mc command. function command get system mac address 0xd4 function byte count data 1 (op-code) data 2 ? data 7 get system mac address 7 0xd4 mac address msb ? mac address lsb function command byte count data 1 management control request 0xc1 1 parameter number function command read management parameter 0xd1 function byte count data 1 (op-code) data 2 data 3 ? data n read management parameter n 0xd1 parameter number parameter dependent
system manageability ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 777 the returned data is as follow: the parameter that is returned might not be the parameter requested by the mc. the mc should verify the parameter number (default parameter to be returned is 0x1). if the parameter number is 0xff, it means that the data that was requested from the 82576 is not ready yet.the mc should retry the read transaction. it is responsibility of the mc to follow the procedure previously defined. when the mc sends a block read command (as previously described) that is not preceded by a block write command with bytecount=1, the 82576 sets the parameter number in the read block transaction to be 0xfe. 10.5.10.2.5 read management receive filter parameters in order to read the mng rcv filter parameters, the mc should execute two smbus transactions. the first transaction is a block write that sets the parameter that the mc wants to read. the second transaction is block read that read the parameter. block write transaction: the different parameters supported for this command are the same as the parameters supported for update mng receive filter parameters. following the block write the mc should issue a block read that reads the parameter that was set in the block write command: data returned from the 82576: parameter # parameter data keep phy link up 0x00 a single byte parameter: data 2 ? bit 0 set to indicate that the phy link for this port should be kept up. sets the keep_phy_link_up bit. when cleared, clears the keep_phy_link_up bit. bit [7:1] reserved. wrong parameter request 0xfe returned by the 82576 only. this parameter is returned on read transaction, if in the previous read command the mc sets a parameter that is not supported by the 82576. the 82576 is not ready 0xff returned by the 82576 only, on read parameters command when the data that should have been read is not ready. this parameter has no data. 
the mc should retry the read transaction.
function | command | byte count | data 1 | data 2
update mng rcv filter parameters | 0xcc | 1 or 2 | parameter number | parameter data
function | command
request mng rcv filter parameters | 0xcd
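the two-transaction read (a block write that selects the parameter, then a block read that returns it) can be sketched in python as follows. this is an illustration under stated assumptions: the `bus` object with `block_write` and `block_read` methods, the `FakeBus` stand-in, the retry limit and the choice to raise an error on a mismatched parameter number are all hypothetical and not defined by the 82576. only the command codes 0xcc and 0xcd and the 0xfe and 0xff parameter numbers come from the text.

```python
NOT_READY = 0xFF        # "the 82576 is not ready": retry the block read
WRONG_PARAMETER = 0xFE  # unsupported parameter, or read not preceded by a write


def read_parameter(bus, write_cmd, read_cmd, parameter, selector=None, max_retries=5):
    """Select a parameter with a block write, then fetch it with a block read."""
    payload = [parameter] if selector is None else [parameter, selector]
    bus.block_write(write_cmd, payload)
    for _ in range(max_retries):
        data = bus.block_read(read_cmd)  # [op-code, parameter number, data...]
        returned = data[1]
        if returned == NOT_READY:
            continue  # data not ready yet: retry the read transaction
        if returned == WRONG_PARAMETER:
            raise ValueError("82576 reported a wrong parameter request (0xfe)")
        if returned != parameter:
            raise ValueError(f"asked for {parameter:#04x}, got {returned:#04x}")
        return bytes(data[2:])
    raise TimeoutError("parameter still not ready after retries")


class FakeBus:
    """Stand-in for a real smbus master, used only to exercise the sketch."""

    def __init__(self, replies):
        self.replies = list(replies)
        self.writes = []

    def block_write(self, command, data):
        self.writes.append((command, list(data)))

    def block_read(self, command):
        return self.replies.pop(0)


# First reply says "not ready"; the second carries a made-up 4-byte payload.
bus = FakeBus([[0xCD, 0xFF], [0xCD, 0x61, 0x00, 0x00, 0x0C, 0x00]])
value = read_parameter(bus, 0xCC, 0xCD, 0x61, selector=0)
```

checking the returned parameter number before using the data mirrors the requirement that the mc verify it on every read.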
intel ? 82576eb gbe controller ? system manageability intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 778 the parameter that is returned might not be the parameter requested by the mc. the mc should verify the parameter number (default parameter to be returned is 0x1). if the parameter number is 0xff, it means that the data that was requested from the 82576 should supply is not ready yet. the mc should retry the read transaction. it is mc responsibility to follow the procedure previously defined. when the mc sends a block read command (as previously described) that is not preceded by a block write command with bytecount=1, the 82576 sets the parameter number in the read block transaction to be 0xfe. function byte count data 1 (op- code) data 2 data 3 ? data n read mng rcv filter parameters n 0xcd parameter number parameter dependent parameter # parameter data filters enable 0x01 none mng2host configuration 0x0a none fail-over configurations 0x0b none. flex filter 0 enable mask and length 0x10 none flex filter 0 data 0x11 data 2 ? group of flex filter?s bytes: 0x0 = bytes 0-29 0x1 = bytes 30-59 0x2 = bytes 60-89 0x3 = bytes 90-119 0x4 = bytes 120-127 flex filter 1 enable mask and length 0x20 none flex filter 1 data 0x21 same as parameter 0x11 but for filter 1. flex filter 2 enable mask and length 0x30 none flex filter 2 data 0x31 same as parameter 0x11 but for filter 2. flex filter 3 enable mask and length 0x40 none flex filter 3 data 0x41 same as parameter 0x11 but for filter 3. filters valid 0x60 none decision filters 0x61 one byte to define the accessed manageability decision filter (mdef) data 2 ? decision filter number vlan filters 0x62 one byte to define the accessed vlan tag filter (mavtv) data 2 ? vlan filter number flex ports filters 0x63 one byte to define the accessed manageability flex port filter (mfutp). data 2 ? flex port filter number ipv4 filter 0x64 one byte to define the accessed ipv4 address filter (mipaf) data 2 ? 
ipv4 address filter number
system manageability ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 779 10.5.10.2.6 read receive enable configuration the mc uses this command to read the receive configuration data. this data can be configured when using receive enable command or through the nvm. read receive enable configuration command format (smbus read block) is as follows: data returned from the 82576: see also: section 10.5.10.1.3, receive enable command . 10.5.10.2.7 read macsec parameters in order to read the mng macsec parameters, the mc should execute two smb transactions. the first transaction is a block write that sets the parameter that the mc wants to read. the second transaction is block read that reads the parameter. this is the block write transaction: ipv6 filters 0x65 one byte to define the accessed ipv6 address filter (mipaf) data 2 ? pv6 address filter number mac filters 0x66 one byte to define the accessed mac address filters pair (mmal, mmah) data 2 ? mac address filters pair number (0-3) ethertype filters 0x67 1 byte to define ethertype filters (metf) data 2 ? metf filter index (valid values are 0,1,2 and 3) extended decision filter 0x68 1 byte to define the extended decisions filters (mdef_ext & mdef) data 2 ? mdef filter index (valid values are 0...6) wrong parameter request 0xfe returned by the 82576 only. this parameter is returned on read transaction, if in the previous read command the mc sets a parameter that is not supported by the 82576. the 82576 is not ready 0xff returned by the 82576 only, on read parameters command when the data that should have been read is not ready. this parameter has no data.) function command read receive enable 0xda function byte count data 1 (op- code) data 2 data 3 ? data 8 data 9 ? data 12 data 13 data 14 data 15 read receive enable 15 (0x0 f) 0xda receiv e contr ol byte ma c add r ms b ? mac addr lsb ip add r msb ? ip addr lsb mc smbu s addr i/f data byte aler t valu e byt e
intel ? 82576eb gbe controller ? system manageability intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 780 the table below shows the different parameters and their contents: following the block write the mc should issue a block read that will read the parameter that was set in the block write command: the table below shows the different parameters and their contents : function command byte count data 1 data 2 update mng rcv filter parameters 0xc9 1 parameter number parameter data parameter # parameter data macsec interrupt cause 0x0 none macsec rx parameters 0x1 none macsec tx parameters 0x2 none table 10-22. read macsec parameters command format (smbus read block protocol) function command byte count data 1 data 2 - n read macsec parameters 0xd9 2,18 or 22 parameter number parameter data table 10-23. read macsec parameters command parameters parameter # parameter data macsec interrupt cause 0x0 this command shall return 1 byte (data2). this byte contains the macsec interrupt cause, according to the following values: data2: bit 0 ? tx key packet number (pn) threshold met bit 1 ? host requested ownership bit 2 ? host released ownership bit 3 - reserved bit 4 - macsec configuration lost bit 5... 8 - reserved
system manageability ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 781 macsec rx parameters 0x1 data 2: reserved data 3: macsec ownership status. see table 10-24 data 4: macsec host control status. see table 10-25 data 5: rx port identifier (msb) data 6: rx port identifier (lsb) data 7: rx sci (msb) ? data 12: rx sci (lsb) data 13: reserved data 14: rx sa an - the association number currently used for the active sa data 15: rx sa packet number (msb) - ? data 18: rx sa packet number (lsb) rx sa packet number is the last packet number, as read from the last valid rx macsec packet macsec tx parameters 0x2 data 2: reserved data 3: macsec ownership status. see table 10-24 data 4: macsec host control status. see table 10-25 data 5: tx port identifier (msb) data 6: tx port identifier (lsb) note: tx port identifier is reserved to 0x0 for this implementation. data 7: tx sci (msb) ? data 12: tx sci (lsb) data 13: reserved data 14: tx sa an - the association number currently used for the active sa data 15: tx sa packet number (msb) ? data 18: tx sa packet number (lsb) data 19: packet number threshold (msb) ? data 21: packet number threshold (lsb) tx sa packet number is the last packet number, as read from the last valid tx macsec packet. data 22: tx control status. see table 10-26 table 10-24. macsec owner status value description 0x0 host is macsec owner 0x1 bmc is macsec owner table 10-23. read macsec parameters command parameters (continued)
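the byte layouts in table 10-23 can be decoded as sketched below in python. this is an illustration only: the function and dictionary names are hypothetical, the input is assumed to be the returned data bytes starting at data 1 (the parameter number), and multi-byte fields are read most significant byte first, as the msb/lsb labels in the table indicate. the host control bit used here is the "allow host traffic" bit of table 10-25.

```python
INTERRUPT_CAUSES = {
    0: "tx key packet number (pn) threshold met",
    1: "host requested ownership",
    2: "host released ownership",
    4: "macsec configuration lost",
}
OWNERS = {0x0: "host", 0x1: "bmc"}  # table 10-24


def decode_interrupt_cause(data2):
    """Return the names of the cause bits set in data 2 (parameter 0x0)."""
    return [name for bit, name in sorted(INTERRUPT_CAUSES.items())
            if data2 & (1 << bit)]


def parse_rx_parameters(data):
    """Parse the 18 bytes returned for parameter 0x1 (macsec rx parameters).

    data[0] is "data 1", so "data N" in the table is data[N - 1].
    """
    data = bytes(data)
    if len(data) != 18 or data[0] != 0x1:
        raise ValueError("expected 18 bytes starting with parameter number 0x1")
    return {
        "owner": OWNERS.get(data[2], "unknown"),            # data 3
        "host_traffic_allowed": bool(data[3] & 0x02),       # data 4, bit 1
        "rx_port_identifier": int.from_bytes(data[4:6], "big"),     # data 5-6
        "rx_sci": data[6:12],                                # data 7-12
        "rx_sa_an": data[13],                                # data 14
        "rx_sa_packet_number": int.from_bytes(data[14:18], "big"),  # data 15-18
    }
```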
intel ? 82576eb gbe controller ? system manageability intel ? 82576eb gbe controller revision: 2.63 datasheet december 2011 782 10.5.11 lan fail-over in lan teaming mode manageability fail-over is the ability in a dual-port network device (the 82576) to detect that the lan connection on the manageability enabled port is lost and to enable the other port in the device to receive/transmit manageability packets. when the 82576 operates in teaming mode, the os and external mc consider the 82576 as one logical network device. the decision to determine which of the the 82576 ports to use is done internally in the 82576 (or in the ans driver in case of the regular receive/transmit traffic). this section deals with fail-over in teaming mode only. in non-teaming mode, the external mc should consider the 82576's network ports as two different network devices, and is solely responsible for the fail-over mechanism. 10.5.11.1 fail-over functionality in teaming mode, the 82576 mirrors both the network ports into a single smbus slave device. the 82576 automatically handles the configurations of both network ports. thus, for configurations, receiving and transmitting the mc should consider both ports as a single entity. when the currently active port for transmission becomes unavailable (for instance, the link is down), the 82576 automatically tries to switch the packet transmission to the other port. thus, as long as one of the ports is valid, the mc has a valid link indication for the smbus slave. 10.5.11.1.1 transmit functionality in order to transmit a packet, the mc should issue the appropriate smbus packet transmission commands to the 82576. the 82576 will then automatically choose the transmission port. 10.5.11.1.2 receive functionality when the 82576 receives a packet from any of the teamed ports, it notifies and forwards the packet to the mc. table 10-25. 
macsec host control status bit description 0 reserved 1 allow host traffic: 0b = host traffic is blocked 1b = host traffic is allowed 2..7 reserved table 10-26. tx control status bit description 0..4 reserved 5 include sci: 0b = do not include sci in tx packets 1b = include sci in tx packets 6..7 reserved
as both ports might be active (for instance, with a valid link), packets might be received on the currently non-active port. to avoid this, fail-over should be used only in a switched network.
10.5.11.1.3 port switching (fail-over)
while in teaming mode, transmit traffic is always transmitted by the 82576 through only one of the ports at any given time. the 82576 might switch the traffic transmission between ports under any of the following conditions:
1. the current transmitting port link is not available.
2. the preferred primary port is enabled and becomes available for transmission.
10.5.11.1.4 device driver interactions
when the lan device driver is present, the decision to switch between the two ports is made by the device driver. when the device driver is absent, this decision is made internally by the 82576. when the device driver releases teaming mode, such as when the system state changes, the 82576 reconfigures the lan ports to teaming mode. the 82576 accomplishes this by resetting the mac address of the two ports to the teaming address in order to restart teaming. this is followed by transmitting gratuitous arp packets to notify the network that teaming mode has been re-established.
10.5.11.2 fail-over configuration
fail-over operation is configured through the fail-over register, as described in table 10-27. the mc should configure this register after every initialization indication from the 82576 (such as after every firmware reset). the mc needs to use the update management receive filters command, with parameter 0x0b (fail-over configurations). see section 10.5.10.1.5.1. the configurations available to the mc are detailed in this section. in teaming mode, both ports should be configured with the same receive manageability filter parameters (the eeprom sections for port 0 and port 1 should be identical).
10.5.11.2.1 preferred primary port the mc might choose one of the network ports (lan0 or lan1) as a preferred primary port for packet transmission. the 82576 uses the preferred primary port as the transmission port each time the link for that port is valid. for example, the 82576 always switches back to the preferred primary port when available. 10.5.11.2.2 gratuitous arps in order to notify the link partner that a port switching has occurred, the 82576 can be configured to automatically send gratuitous arps. gratuitous arps cause the link partner to update its arp tables to reflect the change. the mc might enable/disable gratuitous arps, configure the number of gratuitous arps, or the interval between them by modifying the fail-over register.
10.5.11.2.3 link down timeout
the mc can control the timeout for a link to be considered invalid. the 82576 waits for this timeout before attempting to switch away from an inactive port.
10.5.11.3 fail-over register
this register is loaded at power up from the nvm or by the lan driver. the mc can change the contents of the fail-over register using the update management receive filters command, with parameter 0x0b (fail-over configurations). see section 10.5.10.1.5.1. table 10-27 lists the register bits.
table 10-27. fail-over register
bits | field | initial value | read/write | description
0 | receive management port 0 enable (rmp0en) | 0x1 | ro | rcv mng port 0 enable. when this bit is set, it reports that management traffic will be received from port 0.
1 | receive management port 1 enable (rmp1en) | 0x1 | ro | rcv mng port 1 enable. when this bit is set, it reports that management traffic will be received from port 1.
2 | management transmit port (mxp) | 0x0 | ro | mng xmt port. 0b - reports that management traffic should be transmitted through port 0. 1b - reports that management traffic should be transmitted through port 1.
3 | preferred primary port (prpp) | 0x0 | rw | preferred primary port. 0b - port 0 is the preferred primary port. 1b - port 1 is the preferred primary port.
4 | preferred primary port enable (prppe) | 0x0 | rw | preferred primary port enable.
5 | reserved | 0x0 | ro | reserved
6 | repeated gratuitous arp enable (rgaen) | 0x0 | rw | repeated gratuitous arp enable. if this bit is set, the 82576 sends a configurable number of gratuitous arp packets (gac bits of this register) using a configurable interval (gati bits of this register) after the following events: a system move to dx; a fail-over event initiated by the 82576.
8:7 | reserved | 0x0 | ro | reserved
9 | teaming fail-over enable on dx (tfoenodx) | 0x0 | rw | teaming fail-over enable on dx. enables the fail-over mechanism. bits 3:8 are valid only if this bit is set.
10:11 | reserved | 0x0 | ro | reserved
10.5.12 example configuration steps
this section provides sample configuration settings for common filtering configurations. four examples are presented. the examples are in pseudo code format, with the name of the smbus command followed by the parameters for that command and an explanation.
10.5.12.1 example 1 - shared mac, rmcp only ports
this example is the most basic configuration. the mac address filtering is shared with the host operating system and only traffic directed to the rmcp ports (26fh and 298h) is filtered. for this example, the mc must issue gratuitous arps because no filter is enabled to pass arp requests to the bmc.
10.5.12.1.1 example 1 - pseudo code
step 1: disable existing filtering
receive enable [00]
using the simple form of the receive enable command, this prevents any packets from reaching the mc by disabling filtering:
receive enable control 00h:
• bit 0 [0] - disable receiving of packets
step 2: configure mdef[0]
update manageability filter parameters [61, 0, 00000c00]
use the update manageability filter parameters command to update decision filters (mdef) (parameter 61h). this updates mdef[0], as indicated by the 2nd parameter (0).
mdef[0] value of 00000c00h:
• bit 10 [1] - port 298h
• bit 11 [1] - port 26fh
step 3: enable filtering
12:15 | gratuitous arp counter (gac) | 0x0 | rw | gratuitous arp counter. indicates the number of gratuitous arp packets that should be sent after a fail-over event and after a move to dx. a value of 0 means that there is no limit on the number of gratuitous arp packets sent.
16:23 | link down fail-over time (ldfot) | 0x0 | rw | link down fail-over time. defines the time (in seconds) the link should be down before failing over to the second port. this is also the time that the primary link should be up (after it was down) before the 82576 fails over back to the primary port.
24:31 | gratuitous arp transmission interval (gati) | 0x0 | rw | gratuitous arp transmission interval. defines the interval in seconds before retransmission of gratuitous arp packets.
table 10-27. fail-over register
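the field layout of table 10-27 can be packed and unpacked as sketched below in python. this is a minimal illustration, assuming the listed bit numbers are positions in a 32-bit value with bit 0 as the least significant bit. the helper names `encode_failover` and `decode_failover` are hypothetical and not part of any intel software.

```python
# name: (least significant bit, width in bits), from table 10-27
FIELDS = {
    "rmp0en": (0, 1),
    "rmp1en": (1, 1),
    "mxp": (2, 1),
    "prpp": (3, 1),
    "prppe": (4, 1),
    "rgaen": (6, 1),
    "tfoenodx": (9, 1),
    "gac": (12, 4),
    "ldfot": (16, 8),
    "gati": (24, 8),
}
READ_ONLY = {"rmp0en", "rmp1en", "mxp"}  # marked "ro" in the table


def decode_failover(reg):
    """Split a 32-bit fail-over register value into named fields."""
    return {name: (reg >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in FIELDS.items()}


def encode_failover(**values):
    """Build a register value from writable fields; unnamed fields stay 0."""
    reg = 0
    for name, value in values.items():
        if name in READ_ONLY:
            raise ValueError(f"{name} is read-only")
        lsb, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bit(s)")
        reg |= value << lsb
    return reg


# Port 1 preferred, fail-over enabled on dx, 3 gratuitous arps,
# 5 s link down time, 2 s between gratuitous arps.
reg = encode_failover(prpp=1, prppe=1, tfoenodx=1, gac=3, ldfot=5, gati=2)
```

rejecting the read-only fields in the encoder keeps the sketch aligned with the ro/rw column of the table.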
receive enable [05]
using the simple form of the receive enable command:
receive enable control 05h:
• bit 0 [1] - enable receiving of packets
• bit 2 [1] - enable status reporting (such as link lost)
• bit 5:4 [00] - notification method = smb alert
• bit 7 [0] - use shared mac
table 10-28. example 1 mdef results
manageability decision filter (mdef)
filter 0 1 2 3 4 5 6 7
l2 unicast address and
broadcast and
manageability vlan and
ip address and
l2 unicast address or
broadcast or
multicast and
arp request or
arp response or
neighbor discovery or
port 0x298 or x
port 0x26f or x
flex port 15:0 or
flex tco 3:0 or
10.5.12.2 example 2 - dedicated mac, auto arp response and rmcp port filtering
this example shows a common configuration: the mc has a dedicated mac and ip address. automatic arp responses are enabled, as well as rmcp port filtering. with automatic arp responses enabled, the mc is not required to send the gratuitous arps as it did in example 1. since arp requests are now filtered, the manageability to host filter is configured to send the arp requests to the host as well, so that the host still receives them. for demonstration purposes, the dedicated mac address is calculated by reading the system mac address and adding 1 to it; assume the system mac is aabbccdc. the ip address for this example is 1.2.3.4. additionally, xsum filtering is enabled. note that not all intel ethernet controllers support automatic arp responses; refer to product specific documentation.
10.5.12.2.1 example 2 - pseudo code
step 1: disable existing filtering
receive enable [00]
using the simple form of the receive enable command, this prevents any packets from reaching the mc by disabling filtering:
receive enable control 00h:
• bit 0 [0] - disable receiving of packets
step 2: read system mac address
get system mac address []
reads the system mac address. assume the returned value is aabbccdc for this example.
step 3: configure xsum filter
update manageability filter parameters [01, 00800000]
use the update manageability filter parameters command to update the filters enable settings (parameter 1). this sets the manageability control (manc) register.
manc register 00800000h:
• bit 23 [1] - xsum filter enable
note that some of the following configuration steps manipulate the manc register indirectly, and that this command sets all bits except xsum to 0. it is important either to do this step before the others, or to read the value of the manc register and write it back with only bit 23 changed. also note that the xsum enable bit may differ between ethernet controllers; refer to product specific documentation.
step 4: configure mdef[0]
update manageability filter parameters [61, 0, 00000c00]
use the update manageability filter parameters command to update decision filters (mdef) (parameter 61h). this updates mdef[0], as indicated by the 2nd parameter (0).
mdef value of 00000c00h:
• bit 10 [1] - port 298h
• bit 11 [1] - port 26fh
step 5: configure mdef[1]
update manageability filter parameters [61, 1, 00000080]
use the update manageability filter parameters command to update decision filters (mdef) (parameter 61h). this updates mdef[1], as indicated by the 2nd parameter (1).
mdef value of 00000080h:
• bit 7 [1] - arp requests
when enabling automatic arp responses, the arp requests still go into the manageability filtering system and as such need to be designated as also needing to be sent to the host.
for this reason a separate mdef is created with only arp request filtering enabled. refer to the next step for more details.
step 6: configure the management to host filter
update manageability filter parameters [0a, 00000002]
use the update manageability filter parameters command to update the management control to host (manc2h) register.
manc2h register 00000002h:
• bit 1 [1] - enable mdef[1] traffic to go to host as well
this allows arp requests to be passed to both manageability and the host. a separate mdef filter is specified for arp requests because, if arp requests had been added to mdef[0] and mdef[0] had then been specified in the management to host configuration, not only arp requests but also rmcp traffic (ports 26fh and 298h) would have been sent to both the mc and the host.
step 7: enable filtering
receive enable [8d, aabbccdd, 01020304, 00, 00, 00]
using the advanced version of the receive enable command, the first parameter is:
receive enable control 8dh:
• bit 0 [1] - enable receiving of packets
• bit 2 [1] - enable status reporting (such as link lost)
• bit 3 [1] - enable automatic arp responses
• bit 5:4 [00] - notification method = smb alert
• bit 7 [1] - use dedicated mac
the second parameter is the mac address (aabbccdd). the third parameter is the ip address (01020304). the last three parameters are zero when the notification method is smb alert.
table 10-29. example 2 mdef results
manageability decision filter (mdef)
filter 0 1 2 3 4 5 6 7
l2 unicast address and x
broadcast and
manageability vlan and
ip address and
l2 unicast address or
broadcast or
multicast and
arp request or x
arp response or
neighbor discovery or
port 0x298 or x
port 0x26f or x
flex port 15:0 or
flex tco 3:0 or
10.5.12.3 example 3 - dedicated mac & ip address
this example provides the mc with a dedicated mac and ip address and allows it to receive arp requests. the mc is then responsible for responding to arp requests.
for demonstration purposes, the dedicated mac address is calculated by reading the system mac address and adding 1 to it; assume the system mac is aabbccdc. the ip address for this example is 1.2.3.4. for this example, the receive enable command is used to configure the mac address filter. in order for the mc to be able to receive arp requests, it needs to specify a filter for them, and that filter needs to be included in the manageability to host filtering so that the host os can also receive arp requests.
10.5.12.3.1 example 3 - pseudo code
step 1: disable existing filtering
receive enable [00]
using the simple form of the receive enable command, this prevents any packets from reaching the mc by disabling filtering:
receive enable control 00h:
• bit 0 [0] - disable receiving of packets
step 2: read system mac address
get system mac address []
reads the system mac address. assume the returned value is aabbccdc for this example.
step 3: configure ip address filter
update manageability filter parameters [64, 00, 01020304]
use the update manageability filter parameters command to configure an ipv4 filter. the 1st parameter (64h) specifies that an ipv4 filter is being configured. the 2nd parameter (00h) indicates which ipv4 filter is being configured, in this case filter 0. the 3rd parameter is the ip address - 1.2.3.4.
step 4: configure mac address filter
update manageability filter parameters [66, 00, aabbccdd]
use the update manageability filter parameters command to configure a mac address filter. the 1st parameter (66h) specifies that a mac address filter is being configured. the 2nd parameter (00h) indicates which mac address filter is being configured, in this case filter 0.
the 3rd parameter is the mac address - aabbccdd.
step 5: configure mdef[0] for ip and mac filtering
update manageability filter parameters [61, 0, 00000009]
use the update manageability filter parameters command to update decision filters (mdef) (parameter 61h). this updates mdef[0], as indicated by the 2nd parameter (0).
mdef value of 00000009h:
• bit 0 [1] - mac address filtering
• bit 3 [1] - ip address filtering
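the mdef values used in these examples are plain bit masks, which can be checked with a short python sketch. this is an illustration only: the names in `MDEF_BITS` and the helper functions are hypothetical, and the table lists only the bit positions that the example pseudo code itself names (bits 0, 2, 3, 7, 10 and 11).

```python
# Bit positions as named in the example pseudo code of section 10.5.12.
MDEF_BITS = {
    "mac_address_and": 0,   # example 3: mac address filtering
    "vlan_and": 2,          # example 4: vlan and
    "ip_address_and": 3,    # example 3: ip address filtering
    "arp_request_or": 7,    # examples 2 and 3: arp requests
    "port_0x298_or": 10,    # examples 1 and 2: port 298h
    "port_0x26f_or": 11,    # examples 1 and 2: port 26fh
}


def mdef_value(*conditions):
    """Combine the named conditions into a 32-bit mdef value."""
    value = 0
    for name in conditions:
        value |= 1 << MDEF_BITS[name]
    return value


def mdef_hex(*conditions):
    """Format the value the way the pseudo code writes it (8 hex digits)."""
    return f"{mdef_value(*conditions):08x}"
```

for instance, `mdef_hex("mac_address_and", "ip_address_and")` gives the 00000009 value written to mdef[0] in step 5.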
step 6: configure mdef[1]
update manageability filter parameters [61, 1, 00000080]
use the update manageability filter parameters command to update decision filters (mdef) (parameter 61h). this updates mdef[1], as indicated by the 2nd parameter (1).
mdef value of 00000080h:
• bit 7 [1] - arp requests
when filtering arp requests, the requests go into the manageability filtering system and as such need to be designated as also needing to be sent to the host. for this reason a separate mdef is created with only arp request filtering enabled.
step 7: configure the management to host filter
update manageability filter parameters [0a, 00000002]
use the update manageability filter parameters command to update the management control to host (manc2h) register.
manc2h register 00000002h:
• bit 1 [1] - enable mdef[1] traffic to go to host as well
step 8: enable filtering
receive enable [05]
using the simple form of the receive enable command:
receive enable control 05h:
• bit 0 [1] - enable receiving of packets
• bit 2 [1] - enable status reporting (such as link lost)
• bit 5:4 [00] - notification method = smb alert
the resulting mdef filters are as follows:
table 10-30. example 3 mdef results
manageability decision filter (mdef)
filter 0 1 2 3 4 5 6 7
l2 unicast address and x
broadcast and
manageability vlan and
ip address and x
l2 unicast address or
broadcast or
multicast and
arp request or x
arp response or
neighbor discovery or
port 0x298 or
port 0x26f or
flex port 15:0 or
flex tco 3:0 or
table 10-30. example 3 mdef results
10.5.12.4 example 4 - dedicated mac and vlan tag
this example shows an alternate configuration: the mc has a dedicated mac and ip address, and a vlan tag of 32h is required for traffic to be sent to the bmc. this means that all traffic with a matching vlan tag is sent to the bmc. for demonstration purposes, the dedicated mac address is calculated by reading the system mac address and adding 1 to it; assume the system mac is aabbccdc. the ip address for this example is 1.2.3.4 and the vlan tag is 0032h. it is assumed that the host does not use the same vlan tag as the bmc. if they were to share the same vlan tag, additional filtering would need to be configured to allow vlan tagged non-unicast traffic (such as arp requests) to be sent to the host as well as the mc, using the manageability to host filter capability. additionally, xsum filtering is enabled.
10.5.12.4.1 example 4 - pseudo code
step 1: disable existing filtering
receive enable [00]
using the simple form of the receive enable command, this prevents any packets from reaching the mc by disabling filtering:
receive enable control 00h:
• bit 0 [0] - disable receiving of packets
step 2: read system mac address
get system mac address []
reads the system mac address. assume the returned value is aabbccdc for this example.
step 3: configure xsum filter
update manageability filter parameters [01, 00800000]
use the update manageability filter parameters command to update the filters enable settings (parameter 1). this sets the manageability control (manc) register.
manc register 00800000h:
• bit 23 [1] - xsum filter enable
note that some of the following configuration steps manipulate the manc register indirectly, and that this command sets all bits except xsum to 0. it is important either to do this step before the others, or to read the value of the manc register and write it back with only bit 23 changed. also note that the xsum enable bit may differ between ethernet controllers; refer to product specific documentation.
step 4: configure vlan 0 filter
update manageability filter parameters [62, 0, 0032]
use the update manageability filter parameters command to configure vlan filters. parameter 62h indicates an update to a vlan filter, the 2nd parameter indicates which vlan filter (0 in this case), and the last parameter is the vlan id (0032h).
step 5: configure mdef[0]
update manageability filter parameters [61, 0, 00000004]
use the update manageability filter parameters command to update decision filters (mdef) (parameter 61h). this updates mdef[0], as indicated by the 2nd parameter (0).
mdef value of 00000004h:
• bit 2 [1] - vlan and
step 6: enable filtering
receive enable [85, aabbccdd, 01020304, 00, 00, 00]
using the advanced version of the receive enable command, the first parameter is:
receive enable control 85h:
• bit 0 [1] - enable receiving of packets
• bit 2 [1] - enable status reporting (such as link lost)
• bit 5:4 [00] - notification method = smb alert
• bit 7 [1] - use dedicated mac
the second parameter is the mac address: aabbccdd. the third parameter is the ip address: 01020304. the last three parameters are zero when the notification method is smbus alert.
table 10-31. example 4 mdef results
manageability decision filter (mdef)
filter 0 1 2 3 4 5 6 7
l2 unicast address and x
broadcast and
manageability vlan and x
ip address and
l2 unicast address or
broadcast or
multicast and
10.5.13 SMBus Troubleshooting

This section outlines the most common issues found while working with pass-through using the SMBus sideband interface.

10.5.13.1 TCO Alert Line Stays Asserted After a Power Cycle

After the 82576 resets, both of its ports indicate a status change. If the MC only reads status from one port (slave address), the other port continues to assert the TCO alert line.

Ideally, the MC should use the ARA transaction (see Section 10.5.9) to determine which slave asserted the TCO alert. Many customers only wish to use one port for manageability, so using ARA might not be optimal.

An alternative to using ARA is to configure one of the ports not to report status and to set its SMBus timeout period. In this case, the SMBus timeout period determines how long a port asserts the TCO alert line while awaiting a status read from an MC; by default this value is zero (which indicates an infinite timeout).

The SMBus configuration section of the EEPROM has an SMBus notification timeout (ms) field that can be set to a recommended value of 0xFF (for this issue). Note that this timeout value applies to both slave addresses. Along with setting the SMBus notification timeout to 0xFF, it is recommended that the second port be configured in the EEPROM to disable status alerting. This is accomplished by setting the enable status reporting bit to 0b for the desired port in the LAN configuration section of the EEPROM.

The third solution for this issue is to have the MC hard-code the slave addresses and always read from both ports. As with the previous solution, it is recommended that the second port have status reporting disabled.

10.5.13.2 When SMBus Commands Are Always NACK'd

There are several reasons why all commands sent to the 82576 from an MC could be NACK'd. The following are the most common:

• Invalid NVM image
  - The image itself might be invalid, or it could be a valid image that is not a pass-through image, in which case SMBus connectivity is disabled.

Table 10-31. Example 4 MDEF Results (continued)

Manageability Decision Filter (MDEF)     Filter: 0  1  2  3  4  5  6  7
ARP request                      OR
ARP response                     OR
Neighbor discovery               OR
Port 0x298                       OR
Port 0x26F                       OR
Flex port 15:0                   OR
Flex TCO 3:0                     OR
• The MC is not using the correct SMBus address
  - Many MC vendors hard-code the SMBus address(es) into their firmware. If incorrect values are hard-coded, the 82576 does not respond.
  - The SMBus address(es) can be dynamically set using the SMBus ARP mechanism.
• The MC is using the incorrect SMBus interface
  - The EEPROM might be configured to use one physical SMBus port while the MC is physically connected to a different one.
• Bus interference
  - The bus connecting the MC and the 82576 might be unstable.

10.5.13.3 SMBus Clock Speed Is 16.6666 kHz

This can happen when the SMBus connecting the MC and the 82576 is also tied into another device (such as an ICH) that has a maximum clock speed of 16.6666 kHz. The solution is not to connect the SMBus between the 82576 and the MC to this device.

10.5.13.4 A Network-Based Host Application Is Not Receiving Any Network Packets

Reports have been received about an application not receiving any network packets. The application in question was NFS under Linux. The problem was that the application was using the RMCP/RMCP+ IANA reserved port 0x26F (623) and the system was also configured for a MAC and IP address shared between the OS and the MC.

The management control to host configuration, in this situation, was set up not to send RMCP traffic to the OS (this is typically the correct configuration). This means that no traffic sent to port 623 was being routed to the OS.

The solution in this case is to configure the problematic application not to use the reserved port 0x26F.
10.5.13.5 Unable to Transmit Packets From the MC

If the MC has been transmitting and receiving data without issue for a period of time and then begins to receive NACKs from the 82576 when it attempts to write a packet, the problem is most likely that the buffers internal to the 82576 are full of data that has been received from the network but has yet to be read by the MC.

Being an embedded device, the 82576 has limited buffers that are shared for receiving and transmitting data. If an MC does not keep up with reading the incoming data, the 82576 buffers can fill up. This prevents the MC from transmitting more data, resulting in NACKs.

If this situation occurs, the recommended solution is to have the MC issue a Receive Enable command to disable more incoming data, read all the data from the 82576, and then use the Receive Enable command to enable incoming data again.

10.5.13.6 SMBus Fragment Size

The SMBus specification indicates a maximum SMBus transaction size of 32 bytes. Most of the data passed between the 82576 and the MC over the SMBus is RMCP/RMCP+ traffic, which by its very nature (UDP traffic) is significantly larger than 32 bytes in length. Multiple SMBus transactions may therefore be required to move data from the 82576 to the MC or to send data from the MC to the 82576.
Recognizing this bottleneck, the 82576 handles up to 240 bytes of data in a single transaction. This is a configurable setting in the NVM. The default value in the NVM images is 32, per the SMBus specification. If performance is an issue, increase this size.

During initialization, firmware within the 82576 allocates buffers based upon the SMBus fragment size setting within the NVM. The 82576 firmware has a finite amount of RAM for its use: the larger the SMBus fragment size, the fewer buffers it can allocate. For this reason, MC implementations must take care to send data over the SMBus efficiently.

For example, the 82576 firmware has 3 KB of RAM it can use for buffering SMBus fragments. If the SMBus fragment size is 32 bytes, the firmware can allocate 96 buffers of 32 bytes each. As a result, the MC can send a large packet of data (such as KVM) that is 800 bytes in size in 25 fragments of 32 bytes apiece.

However, this might not be the most efficient way, because the MC must break the 800 bytes of data into 25 fragments and send them one at a time.

If the SMBus fragment size is changed to 240 bytes, the 82576 firmware can create 12 buffers of 240 bytes each to receive SMBus fragments. The MC can now send that same 800 bytes of KVM data in only four fragments, which is much more efficient.

A problem arises when the SMBus fragment size is changed in the NVM but the MC does not reflect this change. If a programmer changes the SMBus fragment size in the 82576 to 240 bytes and then wants to send 800 bytes of KVM data, an unmodified MC still sends the data in 32-byte fragments. As a result, firmware runs out of memory.

This is because firmware created 12 buffers of 240 bytes each for fragments, but the MC is only sending fragments of 32 bytes. This results in a memory waste of 208 bytes per fragment.
Then, when the MC attempts to send more than 12 fragments in a single transaction, the 82576 NACKs the SMBus transaction because there is not enough memory to store the KVM data.

In summary, if a programmer increases the SMBus fragment size in the NVM (recommended for efficiency purposes), take care to ensure that the MC implementation reflects this change and uses that fragment size to its fullest when sending SMBus fragments.

10.5.13.7 Losing Link

Normal behavior for the Ethernet controller when the system powers down or performs a reset is for the link to go down temporarily and then come back up again to re-negotiate the link speed. This behavior can have adverse effects on manageability.

For example, if there is an active FTP or Serial over LAN session to the MC, this connection may be lost. To avoid this situation, the MC can use the Management Control command detailed in Section 10.5.10.1.5 to ensure the link stays active at all times.

This command is available when using the NC-SI sideband interface as well.

Care should be taken with this command: if the driver negotiates the maximum link speed, the link speed remains the same when the system powers down or resets. This may have undesirable power consumption consequences. Currently, when using NC-SI, the MC can re-negotiate the link speed. That functionality is not available when using the SMBus interface.
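The buffer arithmetic from the SMBus fragment size discussion (Section 10.5.13.6) can be checked with a short sketch. It assumes, as in the example there, 3 KB of buffer RAM and one buffer consumed per fragment regardless of how full the fragment is; the function names are illustrative and not part of any 82576 interface.

```python
RAM_BYTES = 3 * 1024  # buffer RAM assumed in the datasheet example (3 KB)


def buffers_available(nvm_fragment_size, ram_bytes=RAM_BYTES):
    """Number of whole buffers firmware can allocate for a given NVM fragment size."""
    return ram_bytes // nvm_fragment_size


def fragments_needed(payload_len, mc_fragment_size):
    """Number of SMBus fragments the MC sends for a payload (ceiling division)."""
    return -(-payload_len // mc_fragment_size)


def transfer_fits(payload_len, mc_fragment_size, nvm_fragment_size):
    """True if every fragment of the payload gets its own buffer.

    Assumption: each fragment occupies one buffer, even when the fragment is
    smaller than the buffer (the 208-byte waste described in the text).
    """
    return fragments_needed(payload_len, mc_fragment_size) <= buffers_available(nvm_fragment_size)


if __name__ == "__main__":
    print(buffers_available(32), fragments_needed(800, 32))    # default setting
    print(buffers_available(240), fragments_needed(800, 240))  # enlarged fragments
    print(transfer_fits(800, 32, 240))                         # mismatched MC
```

The last call models the mismatch case: the NVM is set to 240 bytes while the MC still sends 32-byte fragments, so 25 fragments compete for 12 buffers.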
10.5.13.8 Enable XSUM Filtering

If XSUM filtering is enabled, the MC does not need to check this checksum for incoming packets. Only packets that have a valid XSUM are passed to the MC. All others are silently discarded.

This is a way to offload some work from the MC.

10.5.13.9 Still Having Problems?

If problems still exist, contact your field representative. Be prepared to provide the following:

• An SMBus trace, if possible
• A dump of the NVM image. This should be taken from the actual 82576, rather than the NVM image provided by Intel. Parts of the NVM image are changed after writing (such as the physical NVM size).

10.6 NC-SI Pass Through Interface

The Network Controller Sideband Interface (NC-SI) is a DMTF industry standard protocol for the sideband interface. NC-SI uses a modified version of the industry standard RMII interface for the physical layer and defines a new logical layer.

The NC-SI specification can be found at: http://www.dmtf.org/

10.6.1 Overview

10.6.1.1 Terminology

The terminology in this document is taken from the NC-SI specification.

Table 10-32. NC-SI Terminology

Term
    Definition

Frame versus packet
    Frame is used in reference to Ethernet, whereas packet is used everywhere else.

External network interface
    The interface of the network controller that provides connectivity to the external network infrastructure (port).

Internal host interface
    The interface of the network controller that provides connectivity to the host OS running on the platform.

Management controller (MC)
    An intelligent entity comprising HW/FW/SW that resides within a platform and is responsible for some or all management functions associated with the platform (MC, service processor, etc.).
Network controller (NC)
    The component within a system that is responsible for providing connectivity to the external Ethernet network.

Remote media
    The capability to allow remote media devices to appear as if they were attached locally to the host.

Network controller sideband interface
    The interface of the network controller that provides connectivity to a management controller. It can be shortened to sideband interface as appropriate in the context.
10.6.1.2 System Topology

In NC-SI, each physical endpoint (NC package) can have several logical slaves (NC channels).

NC-SI defines that one management controller and up to four network controller packages can be connected to the same NC-SI link.

Table 10-32. NC-SI Terminology (continued)

Interface
    This refers to the entire physical interface, such as both the transmit and receive interface between the management controller and the network controller.

Integrated controller
    The term integrated controller refers to a network controller device that supports two or more channels for NC-SI that share a common NC-SI physical interface. For example, a network controller that has two or more physical network ports and a single NC-SI bus connection.

Multi-drop
    Multi-drop commonly refers to the case where multiple physical communication devices share an electrically common bus and a single device acts as the master of the bus and communicates with multiple slave or target devices. In NC-SI, a management controller serves as the master, and the network controllers are the target devices.

Point-to-point
    Point-to-point commonly refers to the case where only two physical communication devices are interconnected via a physical communication medium. The devices might be in a master/slave relationship, or could be peers. In NC-SI, point-to-point operation refers to the situation where only a single management controller and a single network controller package are used on the bus in a master/slave relationship where the management controller is the master.

Channel
    The control logic and data paths supporting NC-SI pass-through operation on a single network interface (port). A network controller that has multiple network interface ports can support an equivalent number of NC-SI channels.
Package
    One or more NC-SI channels in a network controller that share a common set of electrical buffers and common buffer control for the NC-SI bus. Typically, there is a single, logical NC-SI package for a single physical network controller package (chip or module). However, the specification allows a single physical chip or module to hold multiple NC-SI logical packages.

Control traffic/messages/packets
    Command, response, and notification packets transmitted between the MC and NCs for the purpose of managing NC-SI.

Pass-through traffic/messages/packets
    Non-control packets passed between the external network and the MC through the NC.

Channel arbitration
    Refers to operations where more than one of the network controller channels can be enabled to transmit pass-through packets to the MC at the same time, where arbitration of access to the RXD, CRS_DV, and RX_ER signal lines is accomplished by either software or hardware means.

Logically enabled/disabled NC
    Refers to the state of the network controller wherein pass-through traffic is able/unable to flow through the sideband interface to and from the management controller, as a result of issuing the Enable/Disable Channel command.

NC RX
    Defined as the direction of ingress traffic on the external network controller interface.

NC TX
    Defined as the direction of egress traffic on the external network controller interface.

NC-SI RX
    Defined as the direction of ingress traffic on the sideband enhanced NC-SI interface with respect to the network controller.

NC-SI TX
    Defined as the direction of egress traffic on the sideband enhanced NC-SI interface with respect to the network controller.
Figure 10-4 shows an example topology for a single MC and a single NC package. In this example, the NC package has two NC channels.

Figure 10-4. Single NC Package, Two NC Channels

Figure 10-5 shows an example topology for a single MC and two NC packages. In this example, one NC package has two NC channels and the other has only one NC channel.

Figure 10-5. Two NC Packages (Left, With Two NC Channels and Right, With One NC Channel)

Scenarios in which the NC-SI lines are shared by multiple NCs (Figure 10-5) mandate an arbitration mechanism. The arbitration mechanism is described in Section 10.6.7.1.
10.6.1.3 Data Transport

Since NC-SI is based upon the RMII transport layer, data is transferred in the form of Ethernet frames.

NC-SI defines two types of transmitted frames:

1. Control frames:
   a. Configure and control the interface
   b. Identified by a unique EtherType in their L2 header
2. Pass-through frames:
   a. Actual LAN pass-through frames transferred from/to the MC
   b. Identified as not being a control frame
   c. Attributed to a specific NC channel by their source MAC address (as configured in the NC by the MC)

10.6.1.3.1 Control Frames

NC-SI control frames are identified by a unique NC-SI EtherType (0x88F8).

Control frames are used in a single-threaded operation, meaning commands are generated only by the MC and can only be sent one at a time. Each command from the MC is followed by a single response from the NC (command-response flow), after which the MC is allowed to send a new command.

The only exception to the command-response flow is the Asynchronous Event Notification (AEN). These control frames are sent unsolicited from the NC to the MC.

AEN functionality in the NC must be disabled by default, until activated by the MC using the Enable AEN commands.

To be considered a valid command, a control frame must:

1. Comply with the NC-SI header format.
2. Be targeted to a valid channel in the package via the package ID and channel ID fields. For example, to target an NC channel with a package ID of 0x2 and an internal channel ID of 0x5, the MC must set the channel ID inside the control frame to 0x45. The channel ID is composed of three bits of package ID and five bits of internal channel ID.
3. Contain a correct payload checksum (if used).
4. Meet any other condition defined by NC-SI.

There are also commands (such as Select Package) targeted to the package as a whole. These commands must use an internal channel ID of 0x1F.
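The channel ID composition described above (three bits of package ID in the upper bits, five bits of internal channel ID in the lower bits) can be sketched as follows. The function and constant names are illustrative assumptions and do not come from the NC-SI specification.

```python
PACKAGE_WIDE_CHANNEL = 0x1F  # internal channel ID used for package-level commands


def make_channel_id(package_id, internal_channel_id):
    """Pack a 3-bit package ID and a 5-bit internal channel ID into one byte."""
    if not 0 <= package_id <= 0x7:
        raise ValueError("package ID must fit in three bits")
    if not 0 <= internal_channel_id <= 0x1F:
        raise ValueError("internal channel ID must fit in five bits")
    return (package_id << 5) | internal_channel_id


def split_channel_id(channel_id):
    """Return (package_id, internal_channel_id) for a packed channel ID byte."""
    return channel_id >> 5, channel_id & 0x1F


if __name__ == "__main__":
    # Package 0x2, internal channel 0x5, as in the example in the text.
    print(hex(make_channel_id(0x2, 0x5)))
    # A package-wide command (such as Select Package) addressed to package 0x2.
    print(hex(make_channel_id(0x2, PACKAGE_WIDE_CHANNEL)))
```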
For details, refer to the NC-SI specification.

10.6.1.3.2 NC-SI Frames Receive Flow
Figure 10-6 shows the flow for frames received on the NC from the MC.

Figure 10-6. NC-SI Frames Receive Flow for the NC

10.6.2 NC-SI Support

10.6.2.1 Supported Features

The 82576 supports all the mandatory features of the NC-SI specification (rev 1.0.0a).

Table 10-33 lists the supported commands.

Table 10-33. Supported NC-SI Commands

Command                               Supported?
Clear initial state                   Yes
Get version ID                        Yes
Get parameters                        Yes
Get controller packet statistics      No
Get link status                       Yes (1)
Enable channel                        Yes
Disable channel                       Yes
Reset channel                         Yes
Table 10-33. Supported NC-SI Commands (continued)

Command                               Supported?
Enable VLAN                           Yes (2)
Disable VLAN                          Yes
Enable broadcast                      Yes
Disable broadcast                     Yes
Set MAC address                       Yes
Get NC-SI statistics                  Yes, partially
Enable NC-SI flow control             No
Disable NC-SI flow control            No
Set link                              Yes (1, 3)
Enable global multicast filter        Yes
Disable global multicast filter       Yes
Get capabilities                      Yes
Set VLAN filters                      Yes
AEN enable                            Yes
Get NC-SI pass-through statistics     Yes, partially
Select package                        Yes
Deselect package                      Yes
Enable channel network TX             Yes
Disable channel network TX            Yes
OEM command                           Yes

1. When working with the SGMII interface, this command is not supported.
2. The 82576 does not support filtering of the user priority/CFI bits of VLAN.
3. When one of the LAN devices is assigned for the sole use of manageability and its LAN PCIe function is disabled, using the NC-SI Set Link command while advertising multiple speeds and enabling auto-negotiation results in the lowest possible speed being chosen. To enable a link of a higher speed, the MC should not advertise speeds that are below the desired link speed. When doing so, changing the power state of the LAN device has no effect and the link speed is not re-negotiated.

Table 10-34 lists the optional features supported.

Table 10-34. Optional NC-SI Features Support

Feature: AENs
    Implement: Yes
    Details: The driver state AEN may be emitted up to 15 sec. after the actual driver change.

Feature: Get NC-SI statistics
    Implement: Yes, partially
    Details: Supports the following counters: 1-4, 7.

Feature: Enable/disable global multicast filter
    Implement: Yes, partially
    Details: No support for specific multicast filtering. Support is either to filter out all multicast packets (Enable command) or to pass all multicast packets to the MC (Disable command).
Table 10-34. Optional NC-SI Features Support (continued)

Feature: Get NC-SI pass-through statistics
    Implement: Yes, partially
    Details: Supports the following counters: 2. Supports the following counters only when the OS is down: 1, 6, 7.

Feature: VLAN modes
    Implement: Yes, partially
    Details: Supports only modes 1, 3.

Feature: Buffering capabilities
    Implement: Yes
    Details: 8 KB.

Feature: MAC address filters
    Implement: Yes
    Details: Supports 2 MAC addresses per port.

Feature: Channel count
    Implement: Yes
    Details: Supports 2 channels.

Feature: VLAN filters
    Implement: Yes
    Details: Supports 8 VLAN filters per port.

Feature: Broadcast filters
    Implement: Yes
    Details: Supports the following filters:
        • ARP
        • DHCP
        • NetBIOS

Feature: Multicast filters
    Implement: Yes
    Details: Supports the following filters (1):
        • IPv6 neighbor advertisement
        • IPv6 router advertisement
        • DHCPv6 relay and server multicast

Feature: Set NC-SI flow control
    Implement: No
    Details: Does not support NC-SI flow control.

Feature: Hardware arbitration
    Implement: Yes
    Details: Supports NC-SI HW arbitration.

1. Supported only when all three filters are enabled.

10.6.2.2 NC-SI Mode - Intel Specific Commands

In addition to the regular NC-SI commands, the following Intel vendor-specific commands are supported. The purpose of these commands is to provide a means for the MC to access some of the Intel-specific features present in the 82576.

10.6.2.2.1 Overview

The following features are available via the NC-SI OEM-specific commands:

• Receive filters:
  - Packet addition decision filters 0x0-0x4
  - Packet reduction decision filters 0x5-0x7
  - MNG2HOST register (controls the forwarding of manageability packets to the host)
  - Flex 128 filters 0x0-0x3
  - Flex TCP/UDP port filters 0x0...0xA
  - IPv4/IPv6 filters
• Get System MAC Address - This command enables the MC to retrieve the system MAC address used by the NC. This MAC address can be used for a shared MAC address mode.
• Keep PHY link up (veto bit) enable/disable - This feature enables the MC to block PHY reset, which might cause session loss.
• TCO reset - Enables the MC to reset the 82576.
• Checksum offloading - Offloads IP/UDP/TCP checksum checking from the MC.
• MACsec logic programming

These commands are designed to be compliant with their corresponding SMBus commands (if existing).

All of the commands are based on a single DMTF-defined NC-SI command, known as the OEM command. This command is as follows.

10.6.2.2.2 OEM Command (0x50)

The OEM command can be used by the MC to request the sideband interface to provide vendor-specific information. The Vendor Enterprise Number (VEN) is the unique MIB/SNMP private enterprise number assigned by IANA per organization. Vendors are free to define their own internal data structures in the vendor data fields.

Bytes     31..24                 23..16         15..08         07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20..      Intel command number   Optional data

10.6.2.2.3 OEM Response (0xD0)

Bytes     31..24                 23..16         15..08         07..00
00..15    NC-SI header
16..19    Response code                         Reason code
20..23    Manufacturer ID (Intel 0x157)
24..27    Intel command number   Optional return data

10.6.2.2.4 OEM Specific Command Response Reason Codes
Response code                  Reason code
Value    Description           Value     Description
0x1      Command failed        0x5081    Invalid Intel command number
0x1      Command failed        0x5082    Invalid Intel command parameter number
0x1      Command failed        0x5085    Internal network controller error
0x1      Command failed        0x5086    Invalid vendor enterprise code

Table 10-35. Command Summary

Intel command   Parameter   Command name
0x00            0x00        Set IP filters control
0x01            0x00        Get IP filters control
0x02            0x0A        Set manageability to host
                0x10        Set flexible 128 filter 0 mask and length
                0x11        Set flexible 128 filter 0 data
                0x20        Set flexible 128 filter 1 mask and length
                0x21        Set flexible 128 filter 1 data
                0x30        Set flexible 128 filter 2 mask and length
                0x31        Set flexible 128 filter 2 data
                0x40        Set flexible 128 filter 3 mask and length
                0x41        Set flexible 128 filter 3 data
                0x61        Set packet addition filters
                0x63        Set flex TCP/UDP port filters
                0x64        Set flex IPv4 address filters
                0x65        Set flex IPv6 address filters
                0x67        Set EtherType filter
                0x68        Set packet addition extended filter
0x03            0x0A        Get manageability to host
                0x10        Get flexible 128 filter 0 mask and length
                0x11        Get flexible 128 filter 0 data
                0x20        Get flexible 128 filter 1 mask and length
                0x21        Get flexible 128 filter 1 data
                0x30        Get flexible 128 filter 2 mask and length
                0x31        Get flexible 128 filter 2 data
Table 10-35. Command Summary (continued)

Intel command   Parameter   Command name
                0x40        Get flexible 128 filter 3 mask and length
                0x41        Get flexible 128 filter 3 data
                0x61        Get packet addition filters
                0x63        Get flex TCP/UDP port filters
                0x64        Get flex IPv4 address filters
                0x65        Get flex IPv6 address filters
                0x67        Get EtherType filter
                0x68        Get packet addition extended filter
0x04            0x00        Set unicast packet reduction
                0x01        Set multicast packet reduction
                0x02        Set broadcast packet reduction
0x05            0x00        Get unicast packet reduction
                0x01        Get multicast packet reduction
                0x02        Get broadcast packet reduction
0x06            N/A         Get system MAC address
0x20            N/A         Set Intel management control
0x21            N/A         Get Intel management control
0x22            N/A         Perform TCO reset
0x23            N/A         Enable IP/UDP/TCP checksum offloading
0x24            N/A         Disable IP/UDP/TCP checksum offloading

10.6.2.3 Proprietary Commands Format

10.6.2.3.1 Set Intel Filters Control Command (Intel Command 0x00)

Bytes     31..24     23..16                  15..08     07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20..23    0x00       Filter control index
10.6.2.3.2 Set Intel Filters Control Response Format (Intel Command 0x00)

Bytes     31..24     23..16                  15..08     07..00
00..15    NC-SI header
16..19    Response code                      Reason code
20..23    Manufacturer ID (Intel 0x157)
24..27    0x00       Filter control index

10.6.2.4 Set Intel Filters Control - IP Filters Control Command (Intel Command 0x00, Filter Control Index 0x00)

This command controls different aspects of the Intel filters.

Bytes     31..24     23..16     15..08     07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20..23    0x00       0x00       IP filters control (3-2)
24..27    IP filters control (1-0)

Where "IP filters control" has the following format.

Bit 0 - IPv4/IPv6 mode (default value 1b)
    IPv6 (0b): There are zero IPv4 filters and four IPv6 filters.
    IPv4 (1b): There are four IPv4 filters and three IPv6 filters.

Bits 1..15 - Reserved

Bit 16 - IPv4 filter 0 valid (default value 0b)
    Indicates if the IPv4 address configured in IPv4 address 0 is valid.
    Note: The network controller automatically sets this bit to 1b if the Set Intel Filter - IPv4 Filter command is used for filter zero.

Bit 17 - IPv4 filter 1 valid (default value 0b)
    Indicates if the IPv4 address configured in IPv4 address 1 is valid.
    Note: The network controller automatically sets this bit to 1b if the Set Intel Filter - IPv4 Filter command is used for filter one.

Bit 18 - IPv4 filter 2 valid (default value 0b)
    Indicates if the IPv4 address configured in IPv4 address 2 is valid.
    Note: The network controller automatically sets this bit to 1b if the Set Intel Filter - IPv4 Filter command is used for filter two.
Bit 19 - IPv4 filter 3 valid (default value 0b)
    Indicates if the IPv4 address configured in IPv4 address 3 is valid.
    Note: The network controller automatically sets this bit to 1b if the Set Intel Filter - IPv4 Filter command is used for filter three.

Bits 20..23 - Reserved

Bit 24 - IPv6 filter 0 valid (default value 0b)
    Indicates if the IPv6 address configured in IPv6 address 0 is valid.
    Note: The network controller automatically sets this bit to 1b if the Set Intel Filter - IPv6 Filter command is used for filter zero.

Bit 25 - IPv6 filter 1 valid (default value 0b)
    Indicates if the IPv6 address configured in IPv6 address 1 is valid.
    Note: The network controller automatically sets this bit to 1b if the Set Intel Filter - IPv6 Filter command is used for filter one.

Bit 26 - IPv6 filter 2 valid (default value 0b)
    Indicates if the IPv6 address configured in IPv6 address 2 is valid.
    Note: The network controller automatically sets this bit to 1b if the Set Intel Filter - IPv6 Filter command is used for filter two.

Bit 27 - IPv6 filter 3 valid (default value 0b)
    Indicates if the IPv6 address configured in IPv6 address 3 is valid.
    Note: The network controller automatically sets this bit to 1b if the Set Intel Filter - IPv6 Filter command is used for filter three.

Bits 28..31 - Reserved

10.6.2.4.1 Set Intel Filters Control - IP Filters Control Response (Intel Command 0x00, Filter Control Index 0x00)

Bytes     31..24     23..16     15..08     07..00
00..15    NC-SI header
16..19    Response code         Reason code
20..23    Manufacturer ID (Intel 0x157)
24..27    0x00       0x00

10.6.2.5 Get Intel Filters Control Commands (Intel Command 0x01)

10.6.2.5.1 Get Intel Filters Control - IP Filters Control Command (Intel Command 0x01, Filter Control Index 0x00)

This command retrieves the IP filters control settings.

Bytes     31..24     23..16     15..08     07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20..21    0x01       0x00

10.6.2.5.2 Get Intel Filters Control - IP Filters Control Response (Intel Command 0x01, Filter Control Index 0x00)

Bytes     31..24     23..16     15..08     07..00
00..15    NC-SI header
16..19    Response code         Reason code
20..23    Manufacturer ID (Intel 0x157)
24..27    0x01       0x00       IP filters control (3-2)
28..29    IP filters control (1-0)

10.6.2.6 Set Intel Filters Formats

10.6.2.6.1 Set Intel Filters Command (Intel Command 0x02)

Bytes     31..24     23..16               15..08     07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20..21    0x02       Parameter number     Filters data (optional)

10.6.2.6.2 Set Intel Filters Response (Intel Command 0x02)

Bytes     31..24     23..16                  15..08     07..00
00..15    NC-SI header
16..19    Response code                      Reason code
20..23    Manufacturer ID (Intel 0x157)
24..      0x02       Filter control index    Return data (optional)
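As an illustration of the frame layouts above, the following sketch builds the OEM payload (the bytes that follow the 16-byte NC-SI header) for Get IP Filters Control and decodes the matching response. It assumes big-endian (network) byte order, 16-bit response and reason codes, and that a response code of zero means success; the helper names are hypothetical, and the NC-SI header and any checksum are outside its scope.

```python
import struct

INTEL_MANUFACTURER_ID = 0x157   # from the frame tables in this section
GET_FILTERS_CONTROL = 0x01      # Intel command number
IP_FILTERS_CONTROL_INDEX = 0x00


def build_oem_payload(intel_command, parameter=None, data=b""):
    """Manufacturer ID, Intel command number, optional parameter and data."""
    payload = struct.pack(">IB", INTEL_MANUFACTURER_ID, intel_command)
    if parameter is not None:
        payload += bytes([parameter])
    return payload + data


def parse_get_ip_filters_control(payload):
    """Decode the response bytes that follow the NC-SI header."""
    (response_code, reason_code, manufacturer_id,
     command, index, control) = struct.unpack(">HHIBBI", payload[:14])
    if manufacturer_id != INTEL_MANUFACTURER_ID:
        raise ValueError("unexpected manufacturer ID")
    if response_code != 0:  # assumption: zero indicates success
        raise ValueError("command failed, reason code 0x%04X" % reason_code)
    if (command, index) != (GET_FILTERS_CONTROL, IP_FILTERS_CONTROL_INDEX):
        raise ValueError("response does not match the request")
    return {
        "ipv4_mode": bool(control & 0x1),                                   # bit 0
        "ipv4_valid": [bool(control >> (16 + i) & 1) for i in range(4)],    # bits 16..19
        "ipv6_valid": [bool(control >> (24 + i) & 1) for i in range(4)],    # bits 24..27
    }


if __name__ == "__main__":
    request = build_oem_payload(GET_FILTERS_CONTROL, IP_FILTERS_CONTROL_INDEX)
    print(request.hex())
    # Fabricated response for demonstration only: IPv4 mode, IPv4 filters 0-1 valid.
    response = struct.pack(">HHIBBI", 0, 0, INTEL_MANUFACTURER_ID, 0x01, 0x00, 0x00030001)
    print(parse_get_ip_filters_control(response))
```

The same `build_oem_payload` helper covers the Set Intel Filters command by passing 0x02 as the command, the parameter number, and the filter data.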
10.6.2.6.3 Set Intel Filters - Manageability to Host Command (Intel Command 0x02, Filter Parameter 0x0A)

This command sets the MNG2HOST register. The MNG2HOST register controls whether pass-through packets destined to the MC are also forwarded to the host OS.

Bytes     31..24     23..16     15..08     07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20..23    0x02       0x0A       Manageability to host (3-2)
24..25    Manageability to host (1-0)

The MNG2HOST register has the following structure:

Bits    Description           Default
0       Decision filter 0     Determines if packets that have passed decision filter 0 are also forwarded to the host OS.
1       Decision filter 1     Determines if packets that have passed decision filter 1 are also forwarded to the host OS.
2       Decision filter 2     Determines if packets that have passed decision filter 2 are also forwarded to the host OS.
3       Decision filter 3     Determines if packets that have passed decision filter 3 are also forwarded to the host OS.
4       Decision filter 4     Determines if packets that have passed decision filter 4 are also forwarded to the host OS.
5       Unicast and mixed     Determines if unicast and mixed packets are also forwarded to the host OS.
6       Global multicast      Determines if global multicast packets are also forwarded to the host OS.
7       Broadcast             Determines if broadcast packets are also forwarded to the host OS.
31:8    Reserved              Reserved

10.6.2.6.4 Set Intel Filters - Manageability to Host Response (Intel Command 0x02, Filter Parameter 0x0A)

Bytes     31..24     23..16     15..08     07..00
00..15    NC-SI header
16..19    Response code         Reason code
20..23    Manufacturer ID (Intel 0x157)
24..25    0x02       0x0A
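A minimal sketch of composing the MNG2HOST value and the command bytes that follow the manufacturer ID, following the bit names in the table above. It assumes the 32-bit register value is sent most significant byte first, as the "(3-2)" and "(1-0)" labels suggest; the function names are illustrative only.

```python
UNICAST_AND_MIXED = 1 << 5
GLOBAL_MULTICAST = 1 << 6
BROADCAST = 1 << 7


def mng2host_value(decision_filters=(), unicast_and_mixed=False,
                   global_multicast=False, broadcast=False):
    """Build the MNG2HOST register value from the selected forwarding options."""
    value = 0
    for index in decision_filters:
        if not 0 <= index <= 4:
            raise ValueError("decision filter index must be 0..4")
        value |= 1 << index          # bits 0..4: decision filters 0..4
    if unicast_and_mixed:
        value |= UNICAST_AND_MIXED
    if global_multicast:
        value |= GLOBAL_MULTICAST
    if broadcast:
        value |= BROADCAST
    return value


def mng2host_command_data(value):
    """Intel command 0x02, parameter 0x0A, then the 32-bit value (big-endian)."""
    return bytes([0x02, 0x0A]) + value.to_bytes(4, "big")


if __name__ == "__main__":
    # Forward packets that passed decision filters 0 and 2, plus broadcasts, to the host.
    value = mng2host_value([0, 2], broadcast=True)
    print(hex(value), mng2host_command_data(value).hex())
```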
10.6.2.6.5 Set Intel Filters - Flex Filter 0 Enable Mask and Length Command (Intel Command 0x02, Filter Parameter 0x10/0x20/0x30/0x40)

The following command sets the Intel flex filters mask and length. Use filter parameters 0x10/0x20/0x30/0x40 for flexible filters 0/1/2/3 accordingly.

Bytes     31..24          23..16                 15..08         07..00
00..15    NC-SI header
16..19    Manufacturer ID (Intel 0x157)
20..23    0x02            0x10/0x20/0x30/0x40    Mask byte 1    Mask byte 2
24..27    ..              ..                     ..             ..
28..31    ..              ..                     ..             ..
32..35    ..              ..                     ..             ..
36..37    Response code   Mask byte 16           Reserved       Reserved
38        Length

10.6.2.6.6 Set Intel Filters - Flex Filter 0 Enable Mask and Length Response (Intel Command 0x02, Filter Parameter 0x10/0x20/0x30/0x40)

Bytes     31..24     23..16                 15..08     07..00
00..15    NC-SI header
16..19    Response code                     Reason code
20..23    Manufacturer ID (Intel 0x157)
24..25    0x02       0x10/0x20/0x30/0x40

10.6.2.6.7 Set Intel Filters - Flex Filter 0 Data Command (Intel Command 0x02, Filter Parameter 0x11/0x21/0x31/0x41)

Bytes     31..24     23..16     15..08     07..00
00..15    NC-SI header
16..19 | Manufacturer ID (Intel 0x157)
20.. | 0x02 | 0x11/0x21/0x31/0x41 | Filter Data Group | Filter Data 1 .. Filter Data N

Note: Using this command to configure the filter data must be done after the flex filter mask command is issued and the mask is set.

10.6.2.6.8 Set Intel Filters - Flex Filter 0 Data Response (Intel Command 0x02, Filter Parameter 0x11/0x21/0x31/0x41)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..25 | 0x02 | 0x11/0x21/0x31/0x41

10.6.2.6.9 Set Intel Filters - Packet Addition Decision Filter Command (Intel Command 0x02, Filter Parameter 0x61)

Filter index range: 0x0..0x4.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..23 | 0x02 | 0x61 | Decision Filter (3-2)
24..25 | Decision Filter (1-0)

Bit # | Name | Description
0 | Unicast (AND) | If set, packets must match a unicast filter.
1 | Broadcast (AND) | If set, packets must match the broadcast filter.
2 | VLAN (AND) | If set, packets must match a VLAN filter.
3 | IP Address (AND) | If set, packets must match an IP filter.
4 | Unicast (OR) | If set, packets must match a unicast filter or a different OR filter.
5 | Broadcast (OR) | If set, packets must match the broadcast filter or a different OR filter.
6 | Multicast (AND) | If set, packets must match the multicast filter.
7 | ARP Request (OR) | If set, packets must match the ARP request filter or a different OR filter.
8 | ARP Response (OR) | If set, packets must also match the ARP response filter or a different OR filter.
9 | Neighbor Discovery (OR) | If set, packets must also match the neighbor discovery filter or a different OR filter.
10 | Port 0x298 (OR) | If set, packets must also match a fixed TCP/UDP port 0x298 filter or a different OR filter.
11 | Port 0x26F (OR) | If set, packets must also match a fixed TCP/UDP port 0x26F filter or a different OR filter.
12 | Flex Port 0 (OR) | If set, packets must also match the TCP/UDP port filter 0 or a different OR filter.
13 | Flex Port 1 (OR) | If set, packets must also match the TCP/UDP port filter 1 or a different OR filter.
14 | Flex Port 2 (OR) | If set, packets must also match the TCP/UDP port filter 2 or a different OR filter.
15 | Flex Port 3 (OR) | If set, packets must also match the TCP/UDP port filter 3 or a different OR filter.
16 | Flex Port 4 (OR) | If set, packets must also match the TCP/UDP port filter 4 or a different OR filter.
17 | Flex Port 5 (OR) | If set, packets must also match the TCP/UDP port filter 5 or a different OR filter.
18 | Flex Port 6 (OR) | If set, packets must also match the TCP/UDP port filter 6 or a different OR filter.
19 | Flex Port 7 (OR) | If set, packets must also match the TCP/UDP port filter 7 or a different OR filter.
20 | Flex Port 8 (OR) | If set, packets must also match the TCP/UDP port filter 8 or a different OR filter.
21 | Flex Port 9 (OR) | If set, packets must also match the TCP/UDP port filter 9 or a different OR filter.
22 | Flex Port 10 (OR) | If set, packets must also match the TCP/UDP port filter 10 or a different OR filter.
23 | Flex Port 11 (OR) | If set, packets must also match the TCP/UDP port filter 11 or a different OR filter.
24 | Reserved |
25 | Reserved |
26 | Reserved |
27 | Reserved |
28 | Flex TCO 0 (OR) | If set, packets must also match the flex 128 TCO filter 0 or a different OR filter.
29 | Flex TCO 1 (OR) | If set, packets must also match the flex 128 TCO filter 1 or a different OR filter.
30 | Flex TCO 2 (OR) | If set, packets must also match the flex 128 TCO filter 2 or a different OR filter.
31 | Flex TCO 3 (OR) | If set, packets must also match the flex 128 TCO filter 3 or a different OR filter.

The filtering is divided into two decisions:
• Bits 0, 1, 2, 3, and 6 work in an AND manner; they all must be true in order for a packet to pass (if any were set).
• Bits 4, 5, and 7-31 work in an OR manner; at least one of them must be true for a packet to pass (if any were set).

Note: These filter settings operate according to the VLAN mode, as configured according to the DMTF NC-SI specification. After disabling packet reduction filters, the MC must re-set the VLAN mode using the Set VLAN command.

10.6.2.6.10 Set Intel Filters - Packet Addition Decision Filter Response (Intel Command 0x02, Filter Parameter 0x61)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..25 | 0x02 | 0x61

10.6.2.6.11 Set Intel Filters - Flex TCP/UDP Port Filter Command (Intel Command 0x02, Filter Parameter 0x63)

Filter index range: 0x0..0xA.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..23 | 0x02 | 0x63 | TCP/UDP Port
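The two-stage AND/OR decision described for the packet addition decision filter can be sketched in Python as follows. This is only a model of the rule as stated in this section, not the device's implementation. It treats bit 4 as an OR bit because the bit table labels it "Unicast (OR)", and it assumes that an all-zero filter value passes nothing; neither point is confirmed by the text here. The names `decision_filter_passes` and `matched` are hypothetical.

```python
AND_BITS = (0, 1, 2, 3, 6)
AND_MASK = sum(1 << bit for bit in AND_BITS)   # 0x4F
OR_MASK = 0xFFFFFFFF & ~AND_MASK               # bits 4, 5 and 7..31 (assumption for bit 4)


def decision_filter_passes(filter_value, matched):
    """Model the AND/OR rule for one 32-bit decision filter.

    filter_value: the bits enabled in the decision filter.
    matched: a bitmap, using the same bit positions, of the
             individual filters the packet actually matched.
    """
    if filter_value == 0:
        # Assumption: an empty decision filter is treated as disabled.
        return False
    and_selected = filter_value & AND_MASK
    or_selected = filter_value & OR_MASK
    # Every enabled AND condition must be matched.
    and_ok = (matched & and_selected) == and_selected
    # If any OR condition is enabled, at least one must be matched.
    or_ok = or_selected == 0 or (matched & or_selected) != 0
    return and_ok and or_ok


UNICAST_AND = 1 << 0
ARP_RESPONSE_OR = 1 << 8
PORT_0X298_OR = 1 << 10

example_filter = UNICAST_AND | ARP_RESPONSE_OR | PORT_0X298_OR
print(decision_filter_passes(example_filter, UNICAST_AND | PORT_0X298_OR))
```

In the example, a packet that matches the unicast filter and the port 0x298 filter passes, while a packet that matches only one of the two stages does not.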
10.6.2.6.12 Set Intel Filters - Flex TCP/UDP Port Filter Response (Intel Command 0x02, Filter Parameter 0x63)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..25 | 0x02 | 0x63

10.6.2.6.13 Set Intel Filters - IPv4 Filter Command (Intel Command 0x02, Filter Parameter 0x64)

Note: The filter index range can vary according to the IPv4/IPv6 mode setting in the filters control command.
IPv4 mode: filter index range 0x0..0x3.
IPv6 mode: no IPv4 filters.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..23 | 0x02 | 0x64 | IPv4 Address (3-2)
24..25 | IPv4 Address (1-0)

10.6.2.6.14 Set Intel Filters - IPv4 Filter Response (Intel Command 0x02, Filter Parameter 0x64)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..25 | 0x02 | 0x64
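All Set Intel Filters responses in this section share the same payload shape: response code, reason code, manufacturer ID, then the echoed command and filter parameter. A minimal Python sketch of parsing bytes 16..25 of such a response follows. It assumes 16-bit response and reason codes packed most significant byte first, and it takes response code 0x0000 to mean "command completed" as defined by the DMTF NC-SI specification rather than by this section. The names `IntelFilterResponse`, `parse_set_filter_response`, and `is_ack_for` are hypothetical.

```python
import struct
from typing import NamedTuple

INTEL_MANUFACTURER_ID = 0x157
RESPONSE_COMMAND_COMPLETED = 0x0000  # per the DMTF NC-SI specification


class IntelFilterResponse(NamedTuple):
    response_code: int
    reason_code: int
    manufacturer_id: int
    command: int
    parameter: int


def parse_set_filter_response(payload: bytes) -> IntelFilterResponse:
    """Parse bytes 16..25 of a response (payload starts after the NC-SI header)."""
    if len(payload) < 10:
        raise ValueError("response payload shorter than 10 bytes")
    # Assumption: big-endian, 2-byte response code and 2-byte reason code.
    return IntelFilterResponse(*struct.unpack(">HHIBB", payload[:10]))


def is_ack_for(resp: IntelFilterResponse, command: int, parameter: int) -> bool:
    """True if the response completed and echoes the expected command."""
    return (resp.response_code == RESPONSE_COMMAND_COMPLETED
            and resp.manufacturer_id == INTEL_MANUFACTURER_ID
            and resp.command == command
            and resp.parameter == parameter)


example = bytes.fromhex("0000" "0000" "00000157" "02" "64")
print(is_ack_for(parse_set_filter_response(example), 0x02, 0x64))
```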
10.6.2.6.15 Set Intel Filters - IPv6 Filter Command (Intel Command 0x02, Filter Parameter 0x65)

Note: The filter index range can vary according to the IPv4/IPv6 mode setting in the filters control command.
IPv4 mode: filter index range 0x0..0x2.
IPv6 mode: filter index range 0x0..0x3.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..23 | 0x02 | 0x65 | IP Filter Index | IPv6 Address (MSB, byte 15)
24..27 | .. | .. | .. | ..
28..31 | .. | .. | .. | ..
32..35 | .. | .. | .. | ..
36..37 | .. | IPv6 Address (LSB, byte 0)

10.6.2.6.16 Set Intel Filters - IPv6 Filter Response (Intel Command 0x02, Filter Parameter 0x65)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..25 | 0x02 | 0x65

If the IP filter index is larger than 3, a command failed response code is returned with no reason.

10.6.2.6.17 Set Intel Filters - EtherType Filter Command (Intel Command 0x02, Filter Parameter 0x67)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..23 | 0x02 | 0x67 | EtherType Filter Index | EtherType Filter MSB
24..27 | .. | .. | EtherType Filter LSB

Where the EtherType filter has the format described in Section 8.11.9.

Table 10-36. EtherType Usage
Filter # | Usage | Note
0-1 | Reserved | Not available for generic use.
2 | User defined | Should not be used in MACsec mode.
3 | User defined | Should not be used in MACsec mode.

10.6.2.6.18 Set Intel Filters - EtherType Filter Response (Intel Command 0x02, Filter Parameter 0x67)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..25 | 0x02 | 0x67

If the EtherType filter index is other than 2 or 3, a command failed response code is returned with no reason.

10.6.2.6.19 Set Intel Filters - Packet Addition Extended Decision Filter Command (Intel Command 0x02, Filter Parameter 0x68)

DecisionFilter0 bits 4, 5, and 7-31 and DecisionFilter1 bits 8..11 work in an "OR" manner; thus, at least one of them must be true for a packet to pass (if any were set).

See Figure 10-2 for a description of the decision filters structure.

The command overwrites any previously stored value.

Note: The previous "Set Intel Filters - Packet Addition Decision Filter" command (0x61) remains supported for legacy reasons. If that command is issued, it sets decision filter 0 as provided, and the extended decision filter remains unchanged.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..23 | 0x02 | 0x68 | Extended Decision Filter Index | Extended Decision Filter 1 MSB
24..27 | .. | .. | Extended Decision Filter 1 LSB | Extended Decision Filter 0 MSB
28..30 | .. | .. | Extended Decision Filter 0 LSB

Extended decision filter index range: 0..4.
Filter 0: see Table 10-37.
Filter 1: see Table 10-38.

Table 10-37. Filter Values
Bit # | Name | Description
0 | Unicast (AND) | If set, packets must match a unicast filter.
1 | Broadcast (AND) | If set, packets must match the broadcast filter.
2 | VLAN (AND) | If set, packets must match a VLAN filter.
3 | IP Address (AND) | If set, packets must match an IP filter.
4 | Unicast (OR) | If set, packets must match a unicast filter or a different "OR" filter.
5 | Broadcast (OR) | If set, packets must match the broadcast filter or a different "OR" filter.
6 | Multicast (AND) | If set, packets must match the multicast filter.
7 | ARP Request (OR) | If set, packets must match the ARP request filter or a different "OR" filter.
8 | ARP Response (OR) | If set, packets can pass if they match the ARP response filter.
9 | Neighbor Discovery (OR) | If set, packets can pass if they match the neighbor discovery filter.
10 | Port 0x298 (OR) | If set, packets can pass if they match a fixed TCP/UDP port 0x298 filter.
11 | Port 0x26F (OR) | If set, packets can pass if they match a fixed TCP/UDP port 0x26F filter.
12 | Flex Port 0 (OR) | If set, packets can pass if they match the TCP/UDP port filter 0.
13 | Flex Port 1 (OR) | If set, packets can pass if they match the TCP/UDP port filter 1.
14 | Flex Port 2 (OR) | If set, packets can pass if they match the TCP/UDP port filter 2.
15 | Flex Port 3 (OR) | If set, packets can pass if they match the TCP/UDP port filter 3.
16 | Flex Port 4 (OR) | If set, packets can pass if they match the TCP/UDP port filter 4.
17 | Flex Port 5 (OR) | If set, packets can pass if they match the TCP/UDP port filter 5.
18 | Flex Port 6 (OR) | If set, packets can pass if they match the TCP/UDP port filter 6.
19 | Flex Port 7 (OR) | If set, packets can pass if they match the TCP/UDP port filter 7.
Table 10-37. Filter Values (Continued)
Bit # | Name | Description
20 | Flex Port 8 (OR) | If set, packets can pass if they match the TCP/UDP port filter 8.
21 | Flex Port 9 (OR) | If set, packets can pass if they match the TCP/UDP port filter 9.
22 | Flex Port 10 (OR) | If set, packets can pass if they match the TCP/UDP port filter 10.
23 | DHCPv6 (OR) | If set, packets can pass if they match the DHCPv6 port (0x0223).
24 | DHCP Client (OR) | If set, packets can pass if they match the DHCP server port (0x0043).
25 | DHCP Server (OR) | If set, packets can pass if they match the DHCP client port (0x0044).
26 | NetBIOS Name Service (OR) | If set, packets can pass if they match the NetBIOS name service port (0x0089).
27 | NetBIOS Datagram Service (OR) | If set, packets can pass if they match the NetBIOS datagram service port (0x008A).
28 | Flex TCO 0 (OR) | If set, packets can pass if they match the flex 128 TCO filter 0.
29 | Flex TCO 1 (OR) | If set, packets can pass if they match the flex 128 TCO filter 1.
30 | Flex TCO 2 (OR) | If set, packets can pass if they match the flex 128 TCO filter 2.
31 | Flex TCO 3 (OR) | If set, packets can pass if they match the flex 128 TCO filter 3.

Table 10-38. Extended Filter 1 Values
Bit # | Name | Description
0 | EtherType 0x88F8 | AND filter
1 | EtherType 0x8808 | AND filter
3:2 | EtherType 2-3 | AND filters
7:4 | Reserved | Reserved
8 | EtherType 0x88F8 | OR filter
9 | EtherType 0x8808 | OR filter
11:10 | EtherType 2-3 | OR filters
31:12 | Reserved | Reserved

10.6.2.6.20 Set Intel Filters - Packet Addition Extended Decision Filter Response (Intel Command 0x02, Filter Parameter 0x68)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..25 | 0x02 | 0x68

If the extended decision filter index is greater than 5, a command failed response code is returned with no reason.

10.6.2.7 Get Intel Filters Formats

10.6.2.7.1 Get Intel Filters Command (Intel Command 0x03)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..21 | 0x03 | Parameter Number

10.6.2.7.2 Get Intel Filters Response (Intel Command 0x03)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..25 | 0x03 | Parameter Number | Optional Return Data

10.6.2.7.3 Get Intel Filters - Manageability to Host Command (Intel Command 0x03, Filter Parameter 0x0A)

This command retrieves the MNG2Host register. The MNG2Host register controls whether pass-through packets destined to the MC are also forwarded to the host OS.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..21 | 0x03 | 0x0A

10.6.2.7.4 Get Intel Filters - Manageability to Host Response (Intel Command 0x03, Filter Parameter 0x0A)
Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..27 | 0x03 | 0x0A | Manageability to Host (3-2)
28..29 | Manageability to Host (1-0)

The MNG2Host register has the following structure:

Bits | Name | Description
0 | Decision Filter 0 | Determines if packets that have passed decision filter 0 are also forwarded to the host OS.
1 | Decision Filter 1 | Determines if packets that have passed decision filter 1 are also forwarded to the host OS.
2 | Decision Filter 2 | Determines if packets that have passed decision filter 2 are also forwarded to the host OS.
3 | Decision Filter 3 | Determines if packets that have passed decision filter 3 are also forwarded to the host OS.
4 | Decision Filter 4 | Determines if packets that have passed decision filter 4 are also forwarded to the host OS.
5 | Unicast and Mixed | Determines if unicast and mixed packets are also forwarded to the host OS.
6 | Global Multicast | Determines if global multicast packets are also forwarded to the host OS.
7 | Broadcast | Determines if broadcast packets are also forwarded to the host OS.

10.6.2.7.5 Get Intel Filters - Flex Filter 0 Enable Mask and Length Command (Intel Command 0x03, Filter Parameter 0x10/0x20/0x30/0x40)

The following command retrieves the Intel flex filters mask and length. Use filter parameters 0x10/0x20/0x30/0x40 for flexible filters 0/1/2/3, respectively.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..21 | 0x03 | 0x10/0x20/0x30/0x40
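To illustrate how an MC might read back the MNG2Host setting, the following Python sketch decodes bytes 16..29 of a Get Intel Filters - Manageability to Host response into the names of the set bits. It assumes most-significant-byte-first packing, uses the bit names from the structure table, and ignores bits 31:8. The function name `decode_get_mng2host_response` is hypothetical.

```python
import struct

# Index in this list equals the bit position in the MNG2Host register.
MNG2HOST_NAMES = [
    "decision filter 0",
    "decision filter 1",
    "decision filter 2",
    "decision filter 3",
    "decision filter 4",
    "unicast and mixed",
    "global multicast",
    "broadcast",
]


def decode_get_mng2host_response(payload: bytes):
    """Return the names of the MNG2Host bits set in the response payload."""
    if len(payload) < 14:
        raise ValueError("response payload shorter than 14 bytes")
    # Assumption: big-endian fields, laid out as in the response table.
    _response, _reason, _mfr, command, parameter, value = struct.unpack(
        ">HHIBBI", payload[:14])
    if (command, parameter) != (0x03, 0x0A):
        raise ValueError("not a Get Manageability to Host response")
    return [name for bit, name in enumerate(MNG2HOST_NAMES)
            if value & (1 << bit)]


example = bytes.fromhex("0000" "0000" "00000157" "03" "0a" "00000060")
print(decode_get_mng2host_response(example))
```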
10.6.2.7.6 Get Intel Filters - Flex Filter 0 Enable Mask and Length Response (Intel Command 0x03, Filter Parameter 0x10/0x20/0x30/0x40)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..27 | 0x03 | 0x10/0x20/0x30/0x40 | Mask Byte 1 | Mask Byte 2
28..31 | .. | .. | .. | ..
32..35 | .. | .. | .. | ..
36..39 | .. | .. | .. | ..
40..43 | .. | Mask Byte 16 | Reserved | Reserved
44 | Flexible Filter Length

10.6.2.7.7 Get Intel Filters - Flex Filter 0 Data Command (Intel Command 0x03, Filter Parameter 0x11/0x21/0x31/0x41)

The following command retrieves the Intel flex filters data. Use filter parameters 0x11/0x21/0x31/0x41 for flexible filters 0/1/2/3, respectively.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..21 | 0x03 | 0x11/0x21/0x31/0x41

10.6.2.7.8 Get Intel Filters - Flex Filter 0 Data Response (Intel Command 0x03, Filter Parameter 0x11/0x21/0x31/0x41)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24.. | 0x03 | 0x11/0x21/0x31/0x41 | Filter Group Number | Filter Data 1 .. Filter Data N

10.6.2.7.9 Get Intel Filters - Packet Addition Decision Filter Command (Intel Command 0x03, Filter Parameter 0x61)

Filter index range: 0x0..0x4.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..21 | 0x03 | 0x61

10.6.2.7.10 Get Intel Filters - Packet Addition Decision Filter Response (Intel Command 0x03, Filter Parameter 0x61)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..27 | 0x03 | 0x61 | Decision Filter (3-2)
28..29 | Decision Filter (1-0)

10.6.2.7.11 Get Intel Filters - Flex TCP/UDP Port Filter Command (Intel Command 0x03, Filter Parameter 0x63)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..22 | 0x03 | 0x63 | TCP/UDP Filter Index

Filter index range: 0x0..0xA.

10.6.2.7.12 Get Intel Filters - Flex TCP/UDP Port Filter Response (Intel Command 0x03, Filter Parameter 0x63)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..27 | 0x03 | 0x63 | TCP/UDP Filter Index | TCP/UDP Port (1)
28 | TCP/UDP Port (0)

Filter index range: 0x0..0xA.

10.6.2.7.13 Get Intel Filters - IPv4 Filter Command (Intel Command 0x03, Filter Parameter 0x64)

Note: The filter index range can vary according to the IPv4/IPv6 mode setting in the filters control command.
IPv4 mode: filter index range 0x0..0x3.
IPv6 mode: no IPv4 filters.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..22 | 0x03 | 0x64 | IPv4 Filter Index

10.6.2.7.14 Get Intel Filters - IPv4 Filter Response (Intel Command 0x03, Filter Parameter 0x64)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..27 | 0x03 | 0x64 | IPv4 Filter Index | IPv4 Address (3)
28..29 | IPv4 Address (2-0)

10.6.2.7.15 Get Intel Filters - IPv6 Filter Command (Intel Command 0x03, Filter Parameter 0x65)

Note: The filter index range can vary according to the IPv4/IPv6 mode setting in the filters control command.
IPv4 mode: filter index range 0x0..0x2.
IPv6 mode: filter index range 0x0..0x3.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..22 | 0x03 | 0x65 | IPv6 Filter Index

10.6.2.7.16 Get Intel Filters - IPv6 Filter Response (Intel Command 0x03, Filter Parameter 0x65)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..27 | 0x03 | 0x65 | IPv6 Filter Index | IPv6 Address (MSB, byte 15)
28..31 | .. | .. | .. | ..
32..35 | .. | .. | .. | ..
36..39 | .. | .. | .. | ..
40..42 | .. | .. | IPv6 Address (LSB, byte 0)
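The Get IPv6 Filter response carries the filter index followed by the 16 address bytes, most significant byte first. A minimal Python sketch of extracting the address from bytes 16..42 of such a response follows; it assumes the payload starts immediately after the NC-SI header and that the fields are big-endian. The function name `parse_get_ipv6_filter_response` is hypothetical, and `ipaddress` is the Python standard library module.

```python
import ipaddress
import struct


def parse_get_ipv6_filter_response(payload: bytes):
    """Return (filter_index, IPv6Address) from a Get IPv6 Filter response."""
    if len(payload) < 27:
        raise ValueError("response payload shorter than 27 bytes")
    # Bytes 16..26: response code, reason code, manufacturer ID,
    # command, parameter, filter index (assumed big-endian).
    _response, _reason, _mfr, command, parameter, index = struct.unpack(
        ">HHIBBB", payload[:11])
    if (command, parameter) != (0x03, 0x65):
        raise ValueError("not a Get IPv6 Filter response")
    # Bytes 27..42: the address, MSB (byte 15) first.
    address = ipaddress.IPv6Address(payload[11:27])
    return index, address


header = bytes.fromhex("0000" "0000" "00000157" "03" "65" "01")
example = header + ipaddress.IPv6Address("fe80::1").packed
print(parse_get_ipv6_filter_response(example))
```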
10.6.2.8 Set Intel Packet Reduction Filters Formats

10.6.2.8.1 Set Intel Packet Reduction Filters Command (Intel Command 0x04)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..23 | 0x04 | Packet Reduction Index

10.6.2.8.2 Set Intel Packet Reduction Filters Response (Intel Command 0x04)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24.. | 0x04 | Packet Reduction Index | Optional Return Data

10.6.2.8.3 Set Unicast Packet Reduction Command (Intel Command 0x04, Reduction Filter Index 0x00)

This command causes the NC to filter packets that have passed due to the unicast filter (MAC address filters, as specified in the DMTF NC-SI). Note that unicast filtering might be affected by other filters, as specified in the DMTF NC-SI.

The filtering of these packets is done such that the MC can add a logical condition that a packet must match; otherwise, the packet is discarded.

Note: Packets that might have been blocked can still pass due to other decision filters.

In order to disable unicast packet reduction, the MC should set all reduction filters to 0b. Following such a setting, the NC must forward, to the MC, all packets that have passed the unicast filters (MAC address filtering) as specified in the DMTF NC-SI.

The unicast packet reduction field has the following structure:

Bit # | Name | Description
0 | Reserved |
1 | Reserved |
2 | Reserved |
3 | IP Address | If set, all unicast packets must also match an IP filter.
4 | Reserved |
5 | Reserved |
6 | Reserved |
7 | Reserved |
8 | ARP Response | If set, all unicast packets must also match the ARP response filter (any of the active filters).
9 | Reserved |
10 | Port 0x298 | If set, all unicast packets must also match a fixed TCP/UDP port 0x298 filter.
11 | Port 0x26F | If set, all unicast packets must also match a fixed TCP/UDP port 0x26F filter.
12 | Flex Port 0 | If set, all unicast packets must also match the TCP/UDP port filter 0.
13 | Flex Port 1 | If set, all unicast packets must also match the TCP/UDP port filter 1.
14 | Flex Port 2 | If set, all unicast packets must also match the TCP/UDP port filter 2.
15 | Flex Port 3 | If set, all unicast packets must also match the TCP/UDP port filter 3.
16 | Flex Port 4 | If set, all unicast packets must also match the TCP/UDP port filter 4.
17 | Flex Port 5 | If set, all unicast packets must also match the TCP/UDP port filter 5.
18 | Flex Port 6 | If set, all unicast packets must also match the TCP/UDP port filter 6.
19 | Flex Port 7 | If set, all unicast packets must also match the TCP/UDP port filter 7.
20 | Flex Port 8 | If set, all unicast packets must also match the TCP/UDP port filter 8.
21 | Flex Port 9 | If set, all unicast packets must also match the TCP/UDP port filter 9.
22 | Flex Port 10 | If set, all unicast packets must also match the TCP/UDP port filter 10.
23 | Reserved |
24 | Reserved |
25 | Reserved |
26 | Reserved |
27 | Reserved |
28 | Flex TCO 0 | If set, all unicast packets must also match the flex 128 TCO filter 0.
29 | Flex TCO 1 | If set, all unicast packets must also match the flex 128 TCO filter 1.
30 | Flex TCO 2 | If set, all unicast packets must also match the flex 128 TCO filter 2.
31 | Flex TCO 3 | If set, all unicast packets must also match the flex 128 TCO filter 3.

The filtering is divided into two decisions:
• Bit 3 works in an AND manner; it must be true in order for a packet to pass (if it was set).
• Bits 8-31 work in an OR manner; at least one of them must be true for a packet to pass (if any were set).
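As a rough illustration of the packet reduction commands in this section, the following Python sketch builds bytes 16..25 of a Set Intel Packet Reduction Filters command from a list of condition names. It assumes the 32-bit reduction field directly follows the command byte (0x04) and the reduction filter index, packed most significant byte first. The bit names cover only a subset of the table, and both they and the function name `build_set_packet_reduction` are hypothetical.

```python
import struct

INTEL_MANUFACTURER_ID = 0x157
CMD_SET_PACKET_REDUCTION = 0x04

# Reduction filter indexes used by the unicast, multicast and broadcast commands.
INDEX_UNICAST, INDEX_MULTICAST, INDEX_BROADCAST = 0x00, 0x01, 0x02

# Subset of the bit positions from the packet reduction field table.
REDUCTION_BITS = {
    "ip_address": 3,
    "arp_response": 8,
    "port_0x298": 10,
    "port_0x26f": 11,
    "flex_port_0": 12,
    "flex_tco_0": 28,
}


def build_set_packet_reduction(index, conditions):
    """Return bytes 16..25 of the command; an empty list disables reduction."""
    value = 0
    for name in conditions:
        value |= 1 << REDUCTION_BITS[name]
    # Assumption: big-endian packing, field layout as in the command table.
    return struct.pack(">IBBI", INTEL_MANUFACTURER_ID,
                       CMD_SET_PACKET_REDUCTION, index, value)


print(build_set_packet_reduction(INDEX_UNICAST, ["ip_address", "port_0x26f"]).hex())
print(build_set_packet_reduction(INDEX_UNICAST, []).hex())
```

The second call shows the "all reduction filters set to 0b" case, which the text describes as the way to disable packet reduction for that traffic class.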
Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..23 | 0x04 | 0x00 | Unicast Packet Reduction (3-2)
24..25 | Unicast Packet Reduction (1-0)

10.6.2.8.4 Set Unicast Packet Reduction Response (Intel Command 0x04, Reduction Filter Index 0x00)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..25 | 0x04 | 0x00

10.6.2.8.5 Set Multicast Packet Reduction Command (Intel Command 0x04, Reduction Filter Index 0x01)

This command causes the NC to filter packets that have passed due to the multicast filter (MAC address filters, as specified in the DMTF NC-SI).

The filtering of these packets is done such that the MC can add a logical condition that a packet must match; otherwise, the packet is discarded.

Note: Packets that might have been blocked can still pass due to other decision filters.

In order to disable multicast packet reduction, the MC should set all reduction filters to 0b. Following such a setting, the NC must forward, to the MC, all packets that have passed the multicast filters (global multicast filtering) as specified in the DMTF NC-SI.

The multicast packet reduction field has the following structure:

Bit # | Name | Description
0 | Reserved | Reserved.
1 | Reserved |
2 | Reserved |
3 | IP Address | If set, all multicast packets must also match an IP filter.
4 | Reserved |
5 | Reserved |
6 | Reserved |
7 | Reserved |
8 | ARP Response | If set, all multicast packets must also match the ARP response filter (any of the active filters).
9 | Reserved |
10 | Port 0x298 | If set, all multicast packets must also match a fixed TCP/UDP port 0x298 filter.
11 | Port 0x26F | If set, all multicast packets must also match a fixed TCP/UDP port 0x26F filter.
12 | Flex Port 0 | If set, all multicast packets must also match the TCP/UDP port filter 0.
13 | Flex Port 1 | If set, all multicast packets must also match the TCP/UDP port filter 1.
14 | Flex Port 2 | If set, all multicast packets must also match the TCP/UDP port filter 2.
15 | Flex Port 3 | If set, all multicast packets must also match the TCP/UDP port filter 3.
16 | Flex Port 4 | If set, all multicast packets must also match the TCP/UDP port filter 4.
17 | Flex Port 5 | If set, all multicast packets must also match the TCP/UDP port filter 5.
18 | Flex Port 6 | If set, all multicast packets must also match the TCP/UDP port filter 6.
19 | Flex Port 7 | If set, all multicast packets must also match the TCP/UDP port filter 7.
20 | Flex Port 8 | If set, all multicast packets must also match the TCP/UDP port filter 8.
21 | Flex Port 9 | If set, all multicast packets must also match the TCP/UDP port filter 9.
22 | Flex Port 10 | If set, all multicast packets must also match the TCP/UDP port filter 10.
23 | Reserved |
24 | Reserved |
25 | Reserved |
26 | Reserved |
27 | Reserved |
28 | Flex TCO 0 | If set, all multicast packets must also match the flex 128 TCO filter 0.
29 | Flex TCO 1 | If set, all multicast packets must also match the flex 128 TCO filter 1.
30 | Flex TCO 2 | If set, all multicast packets must also match the flex 128 TCO filter 2.
31 | Flex TCO 3 | If set, all multicast packets must also match the flex 128 TCO filter 3.

The filtering is divided into two decisions:
• Bit 3 works in an AND manner; it must be true in order for a packet to pass (if it was set).
• Bits 4, 5, and 7-31 work in an OR manner; at least one of them must be true for a packet to pass (if any were set).
Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..23 | 0x04 | 0x01 | Multicast Packet Reduction (3-2)
24..25 | Multicast Packet Reduction (1-0)

10.6.2.8.6 Set Multicast Packet Reduction Response (Intel Command 0x04, Reduction Filter Index 0x01)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..25 | 0x04 | 0x01

10.6.2.8.7 Set Broadcast Packet Reduction Command (Intel Command 0x04, Reduction Filter Index 0x02)

This command causes the NC to filter packets that have passed due to the broadcast filter (MAC address filters, as specified in the DMTF NC-SI).

The filtering of these packets is done such that the MC can add a logical condition that a packet must match; otherwise, the packet is discarded.

Note: Packets that might have been blocked can still pass due to other decision filters.

In order to disable broadcast packet reduction, the MC should set all reduction filters to 0b. Following such a setting, the NC must forward, to the MC, all packets that have passed the broadcast filters as specified in the DMTF NC-SI.

The broadcast packet reduction field has the following structure:

Bit # | Name | Description
0 | Reserved | Reserved.
1 | Reserved |
2 | Reserved |
3 | IP Address | If set, all broadcast packets must also match an IP filter.
4 | Reserved |
5 | Reserved |
6 | Reserved |
7 | Reserved |
8 | ARP Response | If set, all broadcast packets must also match the ARP response filter (any of the active filters).
9 | Reserved |
10 | Port 0x298 | If set, all broadcast packets must also match a fixed TCP/UDP port 0x298 filter.
11 | Port 0x26F | If set, all broadcast packets must also match a fixed TCP/UDP port 0x26F filter.
12 | Flex Port 0 | If set, all broadcast packets must also match the TCP/UDP port filter 0.
13 | Flex Port 1 | If set, all broadcast packets must also match the TCP/UDP port filter 1.
14 | Flex Port 2 | If set, all broadcast packets must also match the TCP/UDP port filter 2.
15 | Flex Port 3 | If set, all broadcast packets must also match the TCP/UDP port filter 3.
16 | Flex Port 4 | If set, all broadcast packets must also match the TCP/UDP port filter 4.
17 | Flex Port 5 | If set, all broadcast packets must also match the TCP/UDP port filter 5.
18 | Flex Port 6 | If set, all broadcast packets must also match the TCP/UDP port filter 6.
19 | Flex Port 7 | If set, all broadcast packets must also match the TCP/UDP port filter 7.
20 | Flex Port 8 | If set, all broadcast packets must also match the TCP/UDP port filter 8.
21 | Flex Port 9 | If set, all broadcast packets must also match the TCP/UDP port filter 9.
22 | Flex Port 10 | If set, all broadcast packets must also match the TCP/UDP port filter 10.
23 | Reserved |
24 | Reserved |
25 | Reserved |
26 | Reserved |
27 | Reserved |
28 | Flex TCO 0 | If set, all broadcast packets must also match the flex 128 TCO filter 0.
29 | Flex TCO 1 | If set, all broadcast packets must also match the flex 128 TCO filter 1.
30 | Flex TCO 2 | If set, all broadcast packets must also match the flex 128 TCO filter 2.
31 | Flex TCO 3 | If set, all broadcast packets must also match the flex 128 TCO filter 3.

The filtering is divided into two decisions:
• Bit 3 works in an AND manner; it must be true in order for a packet to pass (if it was set).
• Bits 4, 5, and 7-31 work in an OR manner; at least one of them must be true for a packet to pass (if any were set).
Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..23 | 0x04 | 0x02 | Broadcast Packet Reduction (3-2)
24..25 | Broadcast Packet Reduction (1-0)

10.6.2.8.8 Set Broadcast Packet Reduction Response (Intel Command 0x04, Reduction Filter Index 0x02)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..25 | 0x04 | 0x02

10.6.2.9 Get Intel Packet Reduction Filters Formats

10.6.2.9.1 Get Intel Packet Reduction Filters Command (Intel Command 0x05)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..21 | 0x05 | Reduction Filter Index

10.6.2.9.2 Get Intel Packet Reduction Filters Response (Intel Command 0x05)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24.. | 0x05 | Reduction Filter Index | Optional Return Data

10.6.2.9.3 Get Unicast Packet Reduction Command (Intel Command 0x05, Reduction Filter Index 0x00)

This command causes the NC to disable any packet reductions for unicast address filtering.

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..21 | 0x05 | 0x00

10.6.2.9.4 Get Unicast Packet Reduction Response (Intel Command 0x05, Reduction Filter Index 0x00)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Response Code | Reason Code
20..23 | Manufacturer ID (Intel 0x157)
24..27 | 0x05 | 0x00 | Unicast Packet Reduction (3-2)
28..29 | Unicast Packet Reduction (1-0)

10.6.2.9.5 Get Multicast Packet Reduction Command (Intel Command 0x05, Reduction Filter Index 0x01)

Bytes \ Bits | 31..24 | 23..16 | 15..08 | 07..00
00..15 | NC-SI Header
16..19 | Manufacturer ID (Intel 0x157)
20..21 | 0x05 | 0x01

10.6.2.9.6 Get Multicast Packet Reduction Response (Intel Command 0x05, Reduction Filter Index 0x01)
Get Multicast Packet Reduction response format:

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..27   0x05 / 0x00 / Multicast packet reduction (3-2)
28..29   Multicast packet reduction (1-0)

10.6.2.9.7 Get Broadcast Packet Reduction Command (Intel Command 0x05, Reduction Filter Index 0x02)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..21   0x05 / 0x02

10.6.2.9.8 Get Broadcast Packet Reduction Response (Intel Command 0x05, Reduction Filter Index 0x02)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..27   0x05 / 0x00 / Broadcast packet reduction (3-2)
28..29   Broadcast packet reduction (1-0)

10.6.2.10 System MAC Address

10.6.2.10.1 Get System MAC Address Command (Intel Command 0x06)

To support a system configuration that requires the NC to hold the MAC address for the MC (such as shared MAC address mode), the following command enables the MC to query the NC for a valid MAC address. The NC must return the system MAC addresses. The MC should use the returned MAC address as a shared MAC address by setting it with the Set MAC Address command as defined in NC-SI 1.0.
It is also recommended that the MC use the packet reduction and manageability-to-host commands to set the proper filtering method.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20       0x06

10.6.2.10.2 Get System MAC Address Response (Intel Command 0x06)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..27   0x06 / MAC address
28..30   MAC address

10.6.2.11 Set Intel Management Control Formats

10.6.2.11.1 Set Intel Management Control Command (Intel Command 0x20)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..22   0x20 / 0x00 / Intel management control 1
Where Intel Management Control 1 is as follows:

Bit #   Default value   Description
0       0b              Enable critical session mode (keep PHY link up and veto bit).
                        0b = Disabled
                        1b = Enabled
                        When critical session mode is enabled, the following behaviors are disabled:
                        - The PHY is not reset on PE_RST# and PCIe resets (in-band and link drop). Other reset events are not affected: Internal_Power_On_Reset, device disable, force TCO, and PHY reset by software.
                        - The PHY does not change its power state. As a result, link speed does not change.
                        - The device does not initiate configuration of the PHY, to avoid losing link.
1-7     0x0             Reserved

10.6.2.11.2 Set Intel Management Control Response (Intel Command 0x20)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..25   0x20 / 0x00

10.6.2.12 Get Intel Management Control Formats

10.6.2.12.1 Get Intel Management Control Command (Intel Command 0x21)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..21   0x21 / 0x00

Where Intel Management Control 1 is as described in section 10.6.2.11.2.

10.6.2.12.2 Get Intel Management Control Response (Intel Command 0x21)
Get Intel Management Control response format:

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..26   0x21 / 0x00 / Intel management control 1

10.6.2.13 TCO Reset

This command causes the NC to perform a TCO reset, if force TCO reset is enabled in the NVM. If the MC has detected that the operating system is hung and has blocked the RX/TX path, the force TCO reset clears the data path (RX/TX) of the NC to enable the MC to transmit and receive packets through the NC. When this command is issued to a channel in a package, it applies only to that channel. After successfully performing the command, the NC treats the force TCO command as an indication that the operating system is hung and clears the DRV_LOAD flag (which disables the LAN device driver).

10.6.2.13.1 Perform Intel TCO Reset Command (Intel Command 0x22)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20       0x22 / TCO mode

Where TCO mode is:

Field        Bit(s)   Description
DO_TCO_RST   0        Perform TCO reset.
                      0b: Do nothing.
                      1b: Perform TCO reset.
Reserved     1:1      Reserved (set to 0).
RESET_MGMT   2        Reset manageability; re-load manageability EEPROM words.
                      0b = Do nothing.
                      1b = Issue firmware reset to manageability.
                      Setting this bit generates a one-time firmware reset. Following the reset, management-related data from the EEPROM is loaded.
Reserved     7:3      Reserved (set to 0x00).

Note: For compatibility, the TCO Reset command without the TCO mode parameter is accepted (TCO reset is performed).

10.6.2.13.2 Perform Intel TCO Reset Response (Intel Command 0x22)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..26   0x22

10.6.2.14 Checksum Offloading

This command enables the checksum offloading filters in the NC. When enabled, these filters block any packets that did not pass IP, UDP, and TCP checksums from being forwarded to the MC.

10.6.2.14.1 Enable Checksum Offloading Command (Intel Command 0x23)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20       0x23

10.6.2.14.2 Enable Checksum Offloading Response (Intel Command 0x23)
Enable Checksum Offloading response format:

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..26   0x23

10.6.2.14.3 Disable Checksum Offloading Command (Intel Command 0x24)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20       0x24

10.6.2.14.4 Disable Checksum Offloading Response (Intel Command 0x24)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..26   0x24

10.6.2.15 MACsec Control Commands Format (Intel Command 0x30)

The following commands may be used by the MC to control the different aspects of the MACsec engine.

10.6.2.15.1 Transfer MACsec Ownership to MC Command (Intel Command 0x30, Parameter 0x10)

This command shall cause the Intel® 82576 GbE Controller to clear all MACsec parameters, forcefully release host ownership, and grant ownership to the BMC. The MC may allow the host to use the BMC's key for traffic by setting the "Host Control - Allow Host Traffic" bit.
If ownership of MACsec was previously set to the host, activating this command clears all the MACsec parameters. Otherwise, only the "Allow Host Traffic" bit is affected by this command.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..22   0x30 / 0x10 / Host control

Table 10-39. MACsec Host Control Status

Bit(s)   Description
0        Reserved
1        Allow host traffic:
         0b = Host traffic is blocked
         1b = Host traffic is allowed
2..7     Reserved

10.6.2.15.2 Transfer MACsec Ownership to MC Response (Intel Command 0x30, Parameter 0x10)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..25   0x30 / 0x10

10.6.2.15.3 Transfer MACsec Ownership to Host Command (Intel Command 0x30, Parameter 0x11)

If the MC is the owner of MACsec, this command shall cause the Intel® 82576 GbE Controller to clear all MACsec parameters, release MC ownership, and grant ownership to the host. In this scenario, traffic to and from the MC shall be validated by the host's programmed keys. It is recommended that the MC try to establish network communication with a remote station to verify that the host was successful in programming the keys.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..21   0x30 / 0x11

10.6.2.15.4 Transfer MACsec Ownership to Host Response (Intel Command 0x30, Parameter 0x11)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..25   0x30 / 0x11

10.6.2.15.5 Initialize MACsec RX Command (Intel Command 0x30, Parameter 0x12)

This command may be used by the MC to initialize the MACsec RX engine. This command should be followed by a "Set MACsec RX Key" command to establish a MACsec environment.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..23   0x30 / 0x12 / RX port identifier
24..27   RX SCI [0..3]
28..29   RX SCI [4..5]

Where:
- RX port identifier: the port number by which the NC will identify RX packets. It is recommended that the MC use 0x0 as the port identifier. Note: the MC should use the same port identifier when performing the key exchange.
- RX SCI: a 6-byte unique identifier for the MACsec TX CA. It is recommended that the MC use its MAC address value for this field.

10.6.2.15.6 Initialize MACsec RX Response (Intel Command 0x30, Parameter 0x12)
Initialize MACsec RX response format:

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..25   0x30 / 0x12

10.6.2.15.7 Initialize MACsec TX Command (Intel Command 0x30, Parameter 0x13)

This command may be used by the MC to initialize the MACsec TX engine. This command should be followed by a "Set MACsec TX Key" command to establish a MACsec environment.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..23   0x30 / 0x13 / TX port identifier
24..27   TX SCI [0..3]
28..31   TX SCI [4..5] / Reserved
32..35   Packet number threshold
36       TX control

Where:
- TX port identifier: for this implementation this field is a "don't care" and is automatically set to 0x0.
- TX SCI: a 6-byte unique identifier for the MACsec TX CA. It is recommended that the MC use its MAC address value for this field.
- PN threshold: when a new key is programmed, the packet number is reset to 0x1. With each TX packet, the packet number increments by 1 and is inserted into the packet (to avoid replay attacks). The PN threshold value is the 3 most significant bytes of the TX packet number after which a "Key Exchange Required" AEN is sent to the BMC, if enabled. See section 10.6.2.16 for details of the AEN.
  Example: a PN threshold of 0x123456 means that a notification is sent when the PN reaches 0x123456FF. The fourth byte of the PN threshold can be regarded as reserved, because the NC always treats it as 0xFF.
  Note: if the PN threshold is less than 0x100, the PN threshold is set to a default of 0x4000.
- TX control:
TX control format:

Bit(s)   Description
0..4     Reserved
5        Always include SCI in TX:
         0b = Do not include SCI in TX packets
         1b = Include SCI in TX packets
6..7     Reserved

10.6.2.15.8 Initialize MACsec TX Response (Intel Command 0x30, Parameter 0x13)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..25   0x30 / 0x13

10.6.2.15.9 Set MACsec RX Key Command (Intel Command 0x30, Parameter 0x14)

This command may be used by the MC to set a new MACsec RX key. Upon receiving this command, the NC shall switch to the new RX key and send the response.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..23   0x30 / 0x14 / Reserved / RX SA AN
24..27   RX MACsec key MSB / .. / .. / ..
28..31   .. / .. / .. / ..
32..35   .. / .. / .. / ..
36..39   .. / .. / .. / RX MACsec key LSB

Where:
- RX SA AN: the association number to be used with this key.
- RX MACsec key: the 128-bit (16-byte) key to be used for RX.

The RX SA AN value range is 0..3.
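The PN threshold arithmetic given for the Initialize MACsec TX command (section 10.6.2.15.7) can be illustrated with a short Python sketch. The function names are hypothetical, and the sketch assumes that the "less than 0x100" rule is applied to the 3-byte threshold value; the text does not state which width the comparison uses.

```python
DEFAULT_PN_THRESHOLD = 0x4000  # default quoted in the text
MIN_PN_THRESHOLD = 0x100       # values below this fall back to the default


def effective_pn_threshold(threshold: int) -> int:
    """Return the threshold the NC would use.

    Assumption: the 'less than 0x100' rule applies to the 3-byte value.
    """
    if not 0 <= threshold <= 0xFFFFFF:
        raise ValueError("PN threshold must fit in 3 bytes")
    if threshold < MIN_PN_THRESHOLD:
        return DEFAULT_PN_THRESHOLD
    return threshold


def notification_pn(threshold: int) -> int:
    """Packet number at which the notification is sent.

    The threshold supplies the 3 most significant bytes; the least
    significant byte is always treated as 0xFF.
    """
    return (effective_pn_threshold(threshold) << 8) | 0xFF
```

With these definitions, notification_pn(0x123456) evaluates to 0x123456FF, which matches the example given for the PN threshold field.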
The SA AN value must be different between two successive Set MACsec RX Key commands.

10.6.2.15.10 Set MACsec RX Key Response (Intel Command 0x30, Parameter 0x14)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..25   0x30 / 0x14

If the RX SA AN value is greater than 3, or is the same as in the previous Set MACsec RX Key command, a Command Failed response code is returned with no reason code.

10.6.2.15.11 Set MACsec TX Key Command (Intel Command 0x30, Parameter 0x15)

This command may be used by the MC to set a new MACsec TX key. Upon receiving this command, the NC shall switch to the new TX key and send the response.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..23   0x30 / 0x15 / Reserved / TX SA AN
24..27   TX MACsec key MSB / .. / .. / ..
28..31   .. / .. / .. / ..
32..35   .. / .. / .. / ..
36..39   .. / .. / .. / TX MACsec key LSB

Where:
- TX SA AN: the association number to be used with this key.
- TX MACsec key: the 128-bit (16-byte) key to be used for TX.

The TX SA AN value range is 0..3.

10.6.2.15.12 Set MACsec TX Key Response (Intel Command 0x30, Parameter 0x15)
Set MACsec TX Key response format:

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..25   0x30 / 0x15

If the TX SA AN value is greater than 3, a Command Failed response code is returned with no reason code.

10.6.2.15.13 Enable Network TX Encryption Command (Intel Command 0x30, Parameter 0x16)

This command may be used by the MC to (re)enable encryption of outgoing pass-through packets. After this command is issued and until a response is received, the state of any outgoing packets is undetermined. By default, network TX encryption is enabled.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..22   0x30 / 0x16 / Mode

Mode:
- 0: Authentication only.
- 1: Encryption and authentication.

10.6.2.15.14 Enable Network TX Encryption Response (Intel Command 0x30, Parameter 0x16)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..25   0x30 / 0x16

After sending this response, the NC shall start encrypting outgoing pass-through packets.
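The MACsec control command layouts above share a common prefix after the NC-SI header: the Intel manufacturer ID, the command byte 0x30, and a parameter byte. The following Python sketch builds the bytes from offset 16 onward for two of these commands. It is only an illustration: the function names are hypothetical, the manufacturer ID is assumed to be sent most significant byte first, the reserved and SA AN fields are assumed to occupy one byte each, and the NC-SI header, padding, and checksum are left out.

```python
import struct

INTEL_MANUFACTURER_ID = 0x157       # from the format tables
MACSEC_CONTROL_COMMAND = 0x30
PARAM_SET_TX_KEY = 0x15
PARAM_ENABLE_TX_ENCRYPTION = 0x16


def oem_payload(command: int, parameter: int, data: bytes = b"") -> bytes:
    """Bytes following the 16-byte NC-SI header (offset 16 onward).

    Assumption: the 4-byte manufacturer ID is big-endian.
    """
    return struct.pack(">IBB", INTEL_MANUFACTURER_ID, command, parameter) + data


def set_macsec_tx_key(sa_an: int, key: bytes) -> bytes:
    """Set MACsec TX Key: reserved byte, TX SA AN, then the 16-byte key."""
    if not 0 <= sa_an <= 3:
        raise ValueError("TX SA AN must be in the range 0..3")
    if len(key) != 16:
        raise ValueError("MACsec key must be 128 bits (16 bytes)")
    # Assumption: reserved and SA AN are one byte each (bytes 22 and 23).
    return oem_payload(MACSEC_CONTROL_COMMAND, PARAM_SET_TX_KEY,
                       bytes([0x00, sa_an]) + key)


def enable_tx_encryption(mode: int) -> bytes:
    """Enable Network TX Encryption: 0 = authentication only, 1 = both."""
    if mode not in (0, 1):
        raise ValueError("mode must be 0 or 1")
    return oem_payload(MACSEC_CONTROL_COMMAND, PARAM_ENABLE_TX_ENCRYPTION,
                       bytes([mode]))
```

Rejecting an SA AN above 3 before sending mirrors the Command Failed response that the NC returns for such values, so the MC can catch the mistake locally.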
10.6.2.15.15 Disable Network TX Encryption Command (Intel Command 0x30, Parameter 0x17)

This command may be used by the MC to disable encryption of outgoing pass-through packets. After this command is issued and until a response is received, the state of any outgoing packets is undetermined.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..21   0x30 / 0x17

10.6.2.15.16 Disable Network TX Encryption Response (Intel Command 0x30, Parameter 0x17)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..25   0x30 / 0x17

After sending this response, the NC shall stop encrypting outgoing pass-through packets.

10.6.2.15.17 Enable Network RX Decryption Command (Intel Command 0x30, Parameter 0x18)

This command may be used by the MC to (re)enable decryption of incoming pass-through packets. It causes the NC to execute MACsec offload and to post the frames to the MC (or host) only if the MACsec operation succeeds. After this command is issued and until a response is received, the state of any incoming packets is undetermined. By default, network RX decryption is disabled.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..21   0x30 / 0x18
10.6.2.15.18 Enable Network RX Decryption Response (Intel Command 0x30, Parameter 0x18)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..25   0x30 / 0x18

After sending this response, the NC shall begin decrypting incoming pass-through packets.

10.6.2.15.19 Disable Network RX Decryption Command (Intel Command 0x30, Parameter 0x19)

This command may be used by the MC to disable decryption of incoming pass-through packets. After this command is issued and until a response is received, the state of any incoming packets is undetermined.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..21   0x30 / 0x19

10.6.2.15.20 Disable Network RX Decryption Response (Intel Command 0x30, Parameter 0x19)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..25   0x30 / 0x19

After sending this response, the NC shall stop decrypting incoming pass-through packets.

10.6.2.15.21 Get MACsec Parameters Format (Intel Command 0x31)

The following commands may be used by the MC to retrieve the different MACsec parameters. The responses to these commands are valid only if the MC owns MACsec.
10.6.2.15.22 Get MACsec RX Parameters Command (Intel Command 0x31, Parameter 0x01)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..22   0x31 / 0x01

10.6.2.15.23 Get MACsec RX Parameters Response (Intel Command 0x31, Parameter 0x01)

This command allows the MC to retrieve the currently configured set of RX MACsec parameters.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..27   0x31 / 0x01 / Reserved
28..31   MACsec owner status / MACsec host control status / RX port identifier
32..35   SCI [0..3]
36..39   SCI [4..5] / Reserved / RX SA AN
40..43   RX SA packet number

Where:

Table 10-40. MACsec Owner Status

Value    Description
0x0      Host is MACsec owner
0x1      BMC is MACsec owner

Table 10-41. MACsec Host Control Status

Bit(s)   Description
0        Reserved
1        Allow host traffic:
         0b = Host traffic is blocked
         1b = Host traffic is allowed
2..7     Reserved

- RX port identifier: the RX port identifier.
- RX SCI: the RX SCI identifier.
- RX SA AN: the association number associated with the active SA (for which the last valid RX MACsec packet was received).
- RX SA packet number: the last packet number, as read from the last valid RX MACsec packet.

10.6.2.15.24 Get MACsec TX Parameters Command (Intel Command 0x31, Parameter 0x02)

This command allows the MC to retrieve the currently configured set of TX MACsec parameters.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Manufacturer ID (Intel 0x157)
20..22   0x31 / 0x02

10.6.2.15.25 Get MACsec TX Parameters Response (Intel Command 0x31, Parameter 0x02)

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
16..19   Response code / Reason code
20..23   Manufacturer ID (Intel 0x157)
24..27   0x31 / 0x2 / Reserved
28..31   MACsec owner status / MACsec host control status / TX port identifier
32..35   SCI [0..3]
36..39   SCI [4..5] / Reserved / TX SA AN
40..43   TX SA packet number
44..47   Packet number threshold
48       TX control status
Where:
- TX port identifier: reserved to 0x0 for this implementation.
- TX SCI: the TX SCI identifier.
- TX SA AN: the association number currently used for the active SA.
- TX SA packet number: the last packet number, as read from the last valid RX MACsec packet.
- Packet number threshold.

Table 10-42. MACsec Owner Status

Value    Description
0x0      Host is MACsec owner
0x1      BMC is MACsec owner

Table 10-43. MACsec Host Control Status

Bit(s)   Description
0        Reserved
1        Allow host traffic:
         0b = Host traffic is blocked
         1b = Host traffic is allowed
2..7     Reserved

Table 10-44. TX Control Status

Bit(s)   Description
0..4     Reserved
5        Include SCI:
         0b = Do not include SCI in TX packets
         1b = Include SCI in TX packets
6..7     Reserved

10.6.2.16 MACsec AEN (Intel AEN 0x80)

The following is the AEN that may be sent by the NC following a MACsec event. This AEN must be enabled using the NC-SI "AEN Enable" command, using bit 16 (0x10000) of the AEN enable mask.

Bytes    Bits 31..24 / 23..16 / 15..08 / 07..00
00..15   NC-SI header
20..23   Reserved / 0x80
24..27   Reserved / MACsec event cause
Where MACsec event cause has the following format:

Bit(s)   Description
0        Host requested ownership
1        Host released ownership
2        TX key packet number (PN) threshold met
3        Reserved
4        MACsec configuration lost
5..7     Reserved

10.6.3 Basic NC-SI Workflows

10.6.3.1 Package States

An NC package can be in one of the following two states:
1. Selected: the package is allowed to use the NC-SI lines, meaning the NC package might send data to the MC.
2. De-selected: the package is not allowed to use the NC-SI lines, meaning the NC package cannot send data to the MC.

The MC must select no more than one NC package at any given time. Package selection can be accomplished by one of two methods:
1. Select Package command: this command explicitly selects the NC package.
2. Any other command targeted to a channel in the package also implicitly selects that NC package.

Package de-selection can be accomplished only by issuing the De-select Package command. The MC should always issue the Select Package command as the first command to the package before issuing channel-specific commands. For further details on package selection, refer to the NC-SI specification.

10.6.3.2 Channel States

An NC channel can be in one of the following states:
1. Initial state: the channel only accepts the Clear Initial State command (the package also accepts the Select Package and De-select Package commands).
2. Active state: this is the normal operational mode. All commands are accepted.

For normal operation mode, the MC should always send the Clear Initial State command as the first command to the channel.

10.6.3.3 Discovery

After interface power-up, the MC should perform a discovery process to discover the NCs that are connected to it. This process should include an algorithm similar to the following:
1. For package_id = 0x0 to MAX_PACKAGE_ID:
   a. Issue a Select Package command to package ID package_id.
   b. If a response was received, then:
      For internal_channel_id = 0x0 to MAX_INTERNAL_CHANNEL_ID:
         Issue a Clear Initial State command for package_id | internal_channel_id (the combination of package_id and internal_channel_id that forms the channel ID).
         If a response was received, then:
            Consider internal_channel_id a valid channel for the package_id package.
            The MC can now optionally discover channel capabilities and version ID for the channel.
         Else (if a response was not received), issue a Clear Initial State command three times.
      Issue a De-select Package command to the package (and continue to the next package).
   c. Else, if a response was not received, issue a Select Package command three times.

10.6.3.4 Configurations

This section details the configurations that should be performed by the MC. It is good practice for the MC not to consider any configuration valid unless the MC has explicitly configured it after every reset (entry into the initial state). As a result, it is recommended that the MC re-configure everything at power-up and at channel/package resets.

10.6.3.4.1 NC Capabilities Advertisement

NC-SI defines the Get Capabilities command. It is recommended that the MC use this command and verify that the capabilities match its requirements before performing any configuration. For example, the MC should verify that the NC supports a specific AEN before enabling it.

10.6.3.4.2 Receive Filtering

To receive traffic, the MC must configure the NC with receive filtering rules. These rules are checked on every packet received on the LAN interface (that is, from the network). A packet is forwarded to the MC only if the rules match.
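The discovery procedure of section 10.6.3.3 can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the send callable is a hypothetical transport hook that returns True when a response arrives, the command strings are placeholders, and the channel ID encoding (package ID in the upper 3 bits, internal channel ID in the lower 5 bits, with 0x1F addressing the package itself) is taken from the NC-SI specification rather than from this section. The default limits are likewise assumptions.

```python
from typing import Callable, Dict, List

# Hypothetical transport hook: send(command, channel_id) -> True if answered.
Send = Callable[[str, int], bool]

PACKAGE_LEVEL = 0x1F  # internal channel ID that addresses the whole package


def with_retries(send: Send, command: str, channel_id: int,
                 retries: int = 3) -> bool:
    """Issue a command once, then retry up to three times on no response."""
    if send(command, channel_id):
        return True
    return any(send(command, channel_id) for _ in range(retries))


def discover(send: Send, max_package_id: int = 7,
             max_internal_channel_id: int = 0x1E) -> Dict[int, List[int]]:
    """Return {package_id: [valid internal channel IDs]} for responding NCs."""
    found: Dict[int, List[int]] = {}
    for package_id in range(max_package_id + 1):
        package_channel = (package_id << 5) | PACKAGE_LEVEL
        if not with_retries(send, "select_package", package_channel):
            continue  # no NC package answers at this ID
        channels = []
        for internal_id in range(max_internal_channel_id + 1):
            channel_id = (package_id << 5) | internal_id
            if with_retries(send, "clear_initial_state", channel_id):
                channels.append(internal_id)
        found[package_id] = channels
        # De-select before moving on so only one package is selected at a time.
        send("deselect_package", package_channel)
    return found
```

After discovery, the MC can query capabilities and version ID for each channel listed in the returned mapping.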
10.6.3.4.2.1 MAC Address Filtering

NC-SI defines three types of MAC address filters: unicast, multicast, and broadcast. To be received (not dropped), a packet must match at least one of these filters. The MC should set one MAC address using the Set MAC Address command and enable broadcast and global multicast filtering.

Unicast/Exact Match (Set MAC Address Command)

This filter matches specific 48-bit MAC addresses. The MC must configure this filter with a dedicated MAC address.
The NC might expose three types of unicast/exact match filters (that is, MAC filters that match on the entire 48 bits of the MAC address): unicast, multicast, and mixed. The 82576 exposes two mixed filters, which can be used for both unicast and multicast filtering. The MC should use one mixed filter for its MAC address. Refer to the NC-SI specification, Set MAC Address command, for further details.

Broadcast (Enable/Disable Broadcast Filter Command)

NC-SI defines a broadcast filtering mechanism that has the following states:
1. Enabled: all broadcast traffic is blocked (not forwarded) to the MC, except for specific filters (such as ARP request, DHCP, and NetBIOS).
2. Disabled: all broadcast traffic is forwarded to the MC, with no exceptions.

Refer to the NC-SI specification, Enable/Disable Broadcast Filter command.

Global Multicast (Enable/Disable Global Multicast Filter)

NC-SI defines a multicast filtering mechanism that has the following states:
1. Enabled: all multicast traffic is blocked (not forwarded) to the MC.
2. Disabled: all multicast traffic is forwarded to the MC, with no exceptions.

The recommended operational mode is enabled, with specific filters set. Not all multicast filtering modes are necessarily supported. Refer to the NC-SI specification, Enable/Disable Global Multicast Filter command, for further details.

10.6.3.4.3 VLAN

NC-SI defines the following VLAN work modes:

Mode         Command and name                                   Description
Disabled     Disable VLAN command                               No VLAN frames are received.
Enabled #1   Enable VLAN command with VLAN only                 Only packets that matched a VLAN filter are forwarded to the MC.
Enabled #2   Enable VLAN command with VLAN only + non-VLAN      Packets from mode 1 plus non-VLAN packets are forwarded.
Enabled #3   Enable VLAN command with any-VLAN + non-VLAN       Packets are forwarded regardless of their VLAN state.

Refer to the NC-SI specification, Enable VLAN command, for further details. The 82576 only supports modes #1 and #3.

Recommendation:
1. Modes:
   a. If VLAN is not required, use the disabled mode.
   b. If VLAN is required, use the enabled #1 mode.
2. If enabling VLAN, the MC should also set the active VLAN ID filters using the NC-SI Set VLAN Filter command prior to setting the VLAN mode.
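The VLAN recommendation can be expressed as a small planning helper. The Python sketch below only decides which commands to issue and in what order; the function name and the tuple representation of commands are hypothetical, and the 1..4094 range check reflects the usable IEEE 802.1Q VLAN ID range rather than anything stated in this section.

```python
from typing import List, Optional, Sequence, Tuple

SUPPORTED_VLAN_MODES = {1, 3}  # the 82576 only supports modes #1 and #3
RECOMMENDED_VLAN_MODE = 1      # "VLAN only", recommended when VLAN is required

Command = Tuple[str, Optional[int]]


def plan_vlan_commands(vlan_ids: Sequence[int]) -> List[Command]:
    """Return the NC-SI VLAN commands to issue, in order.

    Filters are set before the mode is enabled, as recommended.
    """
    if not vlan_ids:
        return [("Disable VLAN", None)]
    for vid in vlan_ids:
        if not 1 <= vid <= 4094:
            raise ValueError(f"invalid VLAN ID: {vid}")
    commands: List[Command] = [("Set VLAN Filter", vid) for vid in vlan_ids]
    commands.append(("Enable VLAN", RECOMMENDED_VLAN_MODE))
    return commands
```

Keeping the mode as a named constant makes it easy to switch to mode #3 if untagged traffic must also reach the MC.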
10.6.3.5 Pass-Through Traffic States

The MC has independent, separate controls for the enablement states of the receive (from LAN) and transmit (to LAN) pass-through paths.

10.6.3.6 Channel Enable

This mode controls the state of the receive path:
1. Disabled: the channel does not pass any traffic from the network to the MC.
2. Enabled: the channel passes any traffic from the network (that matched the configured filters) to the MC.

This state also affects AENs: AENs are only sent in the enabled state. The default state is disabled. It is recommended that the MC complete all filtering configuration before enabling the channel.

10.6.3.7 Network Transmit Enable

This mode controls the state of the transmit path:
1. Disabled: the channel does not pass any traffic from the MC to the network.
2. Enabled: the channel passes any traffic from the MC (that matched the source MAC address filters) to the network.

The default state is disabled. The NC filters pass-through packets according to their source MAC address. The NC tries to match that source MAC address to one of the MAC addresses configured by the Set MAC Address command. As a result, the MC should enable network transmit only after configuring the MAC address. It is recommended that the MC complete all filtering configuration (especially MAC addresses) before enabling network transmit. This feature can be used for fail-over scenarios. See section 10.6.7.5.

10.6.4 Asynchronous Event Notifications

Asynchronous event notifications (AENs) are unsolicited messages sent from the NC to the MC to report status changes (such as a link change or an operating system state change).

Recommendations:
- The MC firmware designer should use AENs. To do so, the designer must take into account the possibility that an NC-SI response frame (that is, a frame with the NC-SI EtherType) arrives out of context (not immediately after a command, but rather after an out-of-context AEN).
- To enable AENs, the MC should first query which AENs are supported using the Get Capabilities command, then enable the desired AENs using the Enable AEN command, and only then enable the channel using the Enable Channel command.
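The ordering constraints from sections 10.6.3.6, 10.6.3.7, and 10.6.4 can be summarized in a short Python sketch that plans the steps following a Get Capabilities query. The function name, the AEN identifier strings, and the tuple representation are hypothetical; only the ordering comes from the text.

```python
from typing import List, Set, Tuple

Step = Tuple[str, object]


def plan_channel_bring_up(supported_aens: Set[str], wanted_aens: Set[str],
                          mac_address: str) -> List[Step]:
    """Order the configuration steps that follow a Get Capabilities query.

    supported_aens is assumed to come from the Get Capabilities response.
    """
    # Only enable AENs that the NC reported as supported.
    aens = sorted(wanted_aens & supported_aens)
    steps: List[Step] = [
        ("Set MAC Address", mac_address),          # filtering comes first
        ("Enable Broadcast Filter", None),
        ("Enable Global Multicast Filter", None),
    ]
    if aens:
        steps.append(("Enable AEN", aens))         # before enabling the channel
    steps.append(("Enable Channel", None))         # receive path on
    steps.append(("Enable Network TX", None))      # only after the MAC is set
    return steps
```

Because every configuration is lost on reset, the same plan would be replayed whenever the channel re-enters the initial state.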
10.6.5 Querying Active Parameters

The MC can use the Get Parameters command to query the current status of the operational parameters.

10.6.6 Resets

NC-SI defines two types of resets:
1. Synchronous entry into the initial state.
2. Asynchronous entry into the initial state.

Recommendations:
- The MC firmware designer must keep in mind that following any type of reset, all configurations are considered lost, and thus the MC must re-configure everything.
- Because an asynchronous entry into the initial state might not be reported or explicitly noticed, the MC should periodically poll the NC with NC-SI commands (such as Get Version ID or Get Parameters) to verify that the channel is not in the initial state. Should the NC channel respond to the command with a "Clear Initial State Command Expected" reason code, the MC should consider the channel (and most probably the entire NC package) to have undergone a (possibly unexpected) reset event. Thus, the MC should re-configure the NC. See the NC-SI specification section on detecting pass-through traffic interruption.
- The Intel recommended polling interval is 2-3 seconds.

For exact details on the resets, refer to the NC-SI specification.

10.6.7 Advanced Workflows

10.6.7.1 Multi-NC Arbitration

As described in section 10.6.1.2, in a multi-NC environment, there is a need to arbitrate the NC-SI lines.
figure 10-7 shows the system topology of such an environment.

figure 10-7. multi-nc environment

the nc-si rx lines are shared between the ncs. to enable sharing of the nc-si rx lines, nc-si defines an arbitration scheme. the arbitration scheme mandates that only one nc package can use the nc-si rx lines at any given time. the nc package that is allowed to use these lines is defined as selected. all the other nc packages are de-selected.

nc-si defines two mechanisms for the arbitration scheme:
1. package selection by the mc. in this mechanism, the mc is responsible for arbitrating between the packages by issuing nc-si commands (select/de-select package). the mc is responsible for having only one package selected at any given time.
2. hardware arbitration. in this mechanism, two additional pins on each nc package are used to synchronize the nc packages. each nc package has an arb_in and an arb_out line, and these lines are used to transfer tokens. an nc package that holds a token is considered selected.

comment: hardware arbitration is enabled by an nvm configuration.

for details, refer to the nc-si specification.

10.6.7.2 package selection sequence example

following is an example workflow for an mc; it occurs after discovery, initialization, and configuration. assuming the mc needs to share the nc-si bus between packages, the mc should:
1. define a time slot for each device.
2. discover, initialize, and configure all the nc packages and channels.
3. issue a de-select package command to all the channels.
4. set active_package to 0x0 (or the lowest existing package id).
5. at the beginning of each time slot the mc should:
   a. issue a de-select package command to the active_package. the mc must then wait for a response and then an additional timeout for the package to become de-selected (200 µs). see the nc-si specification, table 10, parameter nc deselect to hi-z interval.
   b. find the next available package (typically active_package = active_package + 1).
   c. issue a select package command to active_package.

10.6.7.3 external link control

the mc can use the nc-si set link command to control the external interface link settings. this command enables the mc to set auto-negotiation, link speed, duplex, and other parameters. this command is only available when the host operating system is not present. the host operating system status can be obtained via the get link status command and/or the host os status change aen.

recommendations:
• unless explicitly needed, it is not recommended to use this feature. the nc-si set link command does not expose all the possible link settings and/or features, which might cause issues under different scenarios. if this feature is used anyway, use it only when the link is down (trust the 82576 until proven otherwise).
• it is recommended that the mc first query the link status using the get link status command. the mc should then use this data as a basis and change only the needed parameters when issuing the set link command.

for details, refer to the nc-si specification.

10.6.7.4 set link while lan pcie functionality is disabled

in cases where the 82576 is used solely for manageability and its lan pcie function is disabled, using the nc-si set link command while advertising multiple speeds and enabling auto-negotiation results in the lowest possible speed being chosen.
to enable a link of a higher speed, the mc should not advertise speeds that are below the desired link speed, as the lowest advertised link speed is chosen. when the 82576 is used only for manageability and the link speed advertisement is configured by the mc, changes in the power state of the lan device have no effect and the link speed is not re-negotiated by the lan device.

10.6.7.5 multiple channels (fail-over)

to support a fail-over scenario, the mc must operate two or more channels. these channels might or might not be in the same package. the key element of a fault-tolerant fail-over scenario is having two (or more) channels identify to the switch with the same mac address, with only one of them active at any given time (in effect, switching the mac address between channels). to accomplish this, nc-si provides the following commands:
1. enable network tx command - this command enables shutting off the network transmit path of a specific channel. this enables the mc to configure all the participating channels with the same mac address but only enable one of them.
2. link status change aen or get link status command.

10.6.7.5.1 fail-over algorithm example

the following is a sample workflow for a fail-over scenario for the 82576 dual-port gbe controller (one package and two channels):
1. the mc initializes and configures both channels after power-up, using the same mac address for both channels.
2. the mc queries the link status of all the participating channels. the mc should continuously monitor the link status of these channels. this can be accomplished by listening to aens (if used) and/or periodically polling using the get link status command.
3. the mc then enables only channel 0 for network transmission.
4. the mc then issues a gratuitous arp (or any other packet with its source mac address) to the network. this packet informs the switch that this specific mac address is registered to channel 0's specific lan port.
5. the mc begins normal workflow.
6. should the mc receive an indication (aen or polling) that the link status for the active channel (channel 0) has changed, the mc should:
   a. disable channel 0 for network transmission.
   b. check if a different channel is available (link is up).
   c. if found:
      • enable network tx for that specific channel.
      • issue a gratuitous arp (or any other packet with its source mac address) to the network. this packet informs the switch that this specific mac address is now registered to that channel's lan port.
      • resume normal workflow.
   d. if not found, report the error and continue polling until a valid channel is found.
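One monitoring step of the workflow above can be sketched as a pure function. This is an illustrative sketch only: the action strings stand in for the NC-SI commands and the gratuitous ARP, the function name is hypothetical, and no link-down tolerance timer is modelled.

```python
from typing import Dict, List, Optional, Tuple


def choose_active(active: Optional[int],
                  link_up: Dict[int, bool]) -> Tuple[Optional[int], List[str]]:
    """Return (new_active_channel, actions) for one monitoring step.

    `active` is the channel currently enabled for network transmit (None at
    start-up); `link_up` maps channel id to its latest link status.
    """
    actions: List[str] = []
    if active is not None and link_up.get(active, False):
        return active, actions                     # step 5: normal workflow
    if active is not None:
        actions.append(f"disable_network_tx({active})")      # step 6a
    for ch in sorted(link_up):                               # step 6b
        if ch != active and link_up[ch]:
            actions.append(f"enable_network_tx({ch})")       # step 6c
            actions.append(f"send_gratuitous_arp({ch})")
            return ch, actions
    actions.append("report_error")                           # step 6d
    return None, actions
```

Calling it with `active=None` covers start-up: the first channel with link up is enabled and announced, so start-up and fail-over take the same path.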
the above algorithm can be generalized such that the start-up and normal workflow are the same. in addition, the mc might need to use a specific channel (such as channel 0). in this case, the mc should switch the network transmit to that specific channel as soon as that channel becomes valid (link is up).

recommendations:
• wait for a link-down-tolerance timeout before a channel is considered invalid. for example, a link re-negotiation might take a few seconds (normally 2 to 3, but possibly up to 9), so the link might be re-established after a short time.
• typically, this timeout is recommended to be three seconds.
• even when enabling and using aens, periodically poll the link status, as dropped aens might not be detected.

10.6.7.6 statistics

the mc might use the statistics commands as defined in nc-si. these counters are meant mostly for debug purposes and are not all supported.
the statistics are divided into three commands:
1. controller statistics - these are statistics on the primary interface (to the host operating system). see the nc-si specification for details.
2. nc-si statistics - these are statistics on the nc-si control frames (such as commands, responses, aens, etc.). see the nc-si specification for details.
3. nc-si pass-through statistics - these are statistics on the nc-si pass-through frames. see the nc-si specification for details.

10.7 manageability host interface

this section details host interaction with the manageability portion of the 82576. the information within this section is only available to the host driver; the mc does not have access.

10.7.1 host csr interface (function 1/0)

the software device driver of function 0/1 communicates with the manageability block through csr access. the manageability block is mapped to address space 0x8800 to 0x8fff on the slave bus of each function.

note: writing to address 0x8800 from function 0 or from function 1 targets the same address in the ram.

10.7.2 host slave command interface to manageability

this interface is used by the software device driver for several of the commands and for delivering various types of data in both directions (manageability-to-host and host-to-manageability). the address space is separated into two areas:
• direct access to the internal arc data ram: the internal data ram is mapped to address space 0x8800 to 0x8eff. writes and reads to this address space go directly to the ram.
• control registers are located at address 0x8f00.

10.7.3 host slave command interface low level flow

this interface is used by external host software to access the manageability subsystem. host software writes a command block or reads a data structure directly from the data ram.
host software controls these transactions through a slave access to the control register. the following flow shows the process of initiating a command to the manageability block:
1. the software device driver reads the control register and checks that the enable bit is set.
2. the software device driver writes the relevant command block into the ram area.
3. the software device driver sets the command bit in the control register. setting this bit causes an interrupt to the arc (can be masked).
4. the software device driver polls the control register for the command bit to be cleared by hardware.
5. when manageability finishes with the command, it clears the command bit (if manageability should reply with data, it clears the bit only after the data is in the ram area where the software device driver can read it).
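The flow above can be sketched against a pair of register accessors. The two addresses come from section 10.7.2; everything else is an assumption made for illustration: the bit positions of the enable, command and sv bits are not given in this section, the `read32`/`write32` callables are hypothetical, and writing the block as little-endian dwords is a guess at the access width.

```python
HICR = 0x8F00        # control register address (section 10.7.2)
RAM_BASE = 0x8800    # start of the data RAM window (section 10.7.2)

# Assumed bit positions, for illustration only.
HICR_EN = 0x01       # enable bit
HICR_C = 0x02        # command bit
HICR_SV = 0x04       # status-valid bit


def send_host_command(read32, write32, block: bytes, max_polls: int = 1000) -> bool:
    """Run one host command; return True if a valid status is in the RAM."""
    if not read32(HICR) & HICR_EN:                    # step 1
        raise RuntimeError("host interface is not enabled")
    padded = block + b"\x00" * (-len(block) % 4)      # pad to whole dwords
    for off in range(0, len(padded), 4):              # step 2
        word = int.from_bytes(padded[off:off + 4], "little")
        write32(RAM_BASE + off, word)
    write32(HICR, read32(HICR) | HICR_C)              # step 3
    for _ in range(max_polls):                        # step 4
        ctrl = read32(HICR)
        if not ctrl & HICR_C:                         # step 5: firmware is done
            return bool(ctrl & HICR_SV)
    raise TimeoutError("command bit was not cleared")
```

A real driver would sleep between polls and bound the wait by time rather than by iteration count.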
if the software device driver reads the control register and the sv bit is set, then there is a valid status of the last command in the ram. if the sv bit is not set, then the command has failed with no status in the ram.

10.7.4 host slave command registers

10.7.4.1 host interface control register (csr address 0x8f00; aux 0x0700)

this register operates along with the host software/firmware interface.

10.7.4.2 firmware status 0 (fws0r) register (csr address 0x8f0c; aux 0x0702)

this register operates along with the host software/firmware interface.

10.7.4.3 software status register (csr address 0x8f10; aux 0x0703)

this register operates along with the host software/firmware interface.

10.7.5 host interface command structure

table 10-45 describes the structure used by the host driver to send a command to manageability firmware using the host interface slave command interface.

table 10-45. host driver command structure
byte | name | bit | value | description
0 | command | 7:0 | command dependent | specifies which host command to process.
1 | buffer length | 7:0 | command length | command data buffer length: 0 to 252, not including 32 bits of header.
2 | default/implicit interface | 0 | command dependent | used for commands that might refer to one of two interfaces (lan or smbus). 0b = use default interface. 1b = use specific interface.
2 | interface number | 1 | command dependent | used when bit 0 (default/implicit interface) is set: 0b = apply command for interface 0. 1b = apply command for interface 1. when bit 0 is set to 0b, it is ignored.
2 | reserved | 7:2 | 0x0 | reserved.
3 | checksum | 7:0 | defined below | checksum signature.
255:4 | data buffer | 7:0 | command dependent | command specific data. minimum buffer size: 0. maximum buffer size: 252.
10.7.6 host interface status structure

table 10-46 lists the structure used by manageability firmware to return a status to the host driver via the host interface slave command interface. a status is returned after a command has been executed.

table 10-46. status structure returned to host driver
byte | name | bit | value | description
0 | command | 7:0 | command dependent | command id.
1 | buffer length | 7:0 | status dependent | status buffer length: 0 to 252.
2 | return status | 7:0 | depends on command execution results | 0x1 = status ok. 0x2 = illegal command id. 0x3 = unsupported command. 0x4 = illegal payload length. 0x5 = checksum failed. 0x6 = data error. 0x7 = invalid parameter. 0x8 - 0xff = reserved.
3 | checksum | 7:0 | defined below | checksum signature.
255:4 | data buffer | 7:0 | status dependent | status configuration parameters. minimum buffer size: 0. maximum buffer size: 252.

10.7.7 checksum calculation algorithm

the host command/status structure is summed with the checksum field cleared to 0b. the calculation is done using 8-bit unsigned math with no carry. the inverse of this sum is stored in the checksum field (0b minus the result). result: the sum of the whole buffer, including the checksum field (8-bit unsigned math), is 0b.

10.7.8 host slave interface commands

the only host interface command supported is the fail-over configuration command (besides debug commands that are not described in this document).

10.7.9 fail-over configuration host command

this command is used to update the fail-over configuration register.

table 10-47. command for updating fail-over configuration register
byte | name | bit | value | description
0 | command | 7:0 | 0x26 | fail-over configuration command.
1 | buffer length | 7:0 | 0x4 | four bytes of the fail-over configuration register.
2 | | 7:0 | 0x0 |
3 | checksum | 7:0 | | checksum signature of the host command.
7:4 | data buffer | 7:0 | fail-over configuration dwords | fail-over register value. byte 4 is byte 0 of the configuration register.

following is the status returned on this command:

table 10-48. status returned
byte | name | bit | value | description
0 | command | 7:0 | 0x26 | fail-over configuration command.
1 | buffer length | 7:0 | 0x0 | no data in return status.
2 | return status | 7:0 | 0x1 | 0x1 for good status.
3 | checksum | 7:0 | | checksum signature.

10.7.10 read fail-over configuration host command

this command is used to read the fail-over configuration register.

table 10-49. command to read the fail-over configuration register
byte | name | bit | value | description
0 | command | 7:0 | 0x27 | read fail-over configuration command.
1 | buffer length | 7:0 | 0x0 | no data attached to this command.
2 | | 7:0 | 0x0 |
3 | checksum | 7:0 | | checksum signature of the host command.

following is the status returned on this command:

table 10-50. status returned
byte | name | bit | value | description
0 | command | 7:0 | 0x27 | read fail-over configuration command.
1 | buffer length | 7:0 | 0x4 | indicates 4 bytes of the fail-over register (bytes 7:4 below).
2 | return status | 7:0 | 0x1 | indicates good status.
3 | checksum | 7:0 | | checksum signature.
7:4 | data buffer | 7:0 | fail-over configuration dwords | fail-over register content. byte 4 is byte 0 of the configuration register.
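The header layout of table 10-45 and the checksum rule of section 10.7.7 can be sketched as follows. The function names are hypothetical, and the four data bytes in the usage example are placeholders, since the meaning of the fail-over register bits is not described here.

```python
RETURN_STATUS = {
    0x1: "status ok", 0x2: "illegal command id", 0x3: "unsupported command",
    0x4: "illegal payload length", 0x5: "checksum failed",
    0x6: "data error", 0x7: "invalid parameter",
}


def checksum(block: bytes) -> int:
    """8-bit checksum: 0 minus the sum of the block with byte 3 taken as 0."""
    total = sum(b for i, b in enumerate(block) if i != 3) & 0xFF
    return (-total) & 0xFF


def build_command(command: int, data: bytes = b"", flags: int = 0) -> bytes:
    """Build a host command block: command, length, flags, checksum, data."""
    if len(data) > 252:
        raise ValueError("data buffer is limited to 252 bytes")
    block = bytearray([command, len(data), flags, 0]) + data
    block[3] = checksum(block)
    return bytes(block)


def parse_status(block: bytes):
    """Return (command id, status text, data) from a status block."""
    if len(block) < 4 or (sum(block) & 0xFF) != 0:
        raise ValueError("short block or bad checksum")
    command, length, status = block[0], block[1], block[2]
    return command, RETURN_STATUS.get(status, "reserved"), block[4:4 + length]
```

For example, `build_command(0x27)` gives the four bytes `27 00 00 d9`: the header sums to 0x27, so the checksum is 0x100 - 0x27 = 0xd9 and the whole block sums to 0 modulo 256. `build_command(0x26, b"\x01\x00\x00\x00")` would carry a placeholder register value in bytes 7:4.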
10.8 macsec and manageability

for details on macsec and the role of manageability in it, see section 7.9.1.6.

pass-through mode is supported in a macsec environment in one of the following modes of operation:
• management traffic is protected by macsec - the lan controller supports a single secure channel for both host and bmc. at a given time, the host and mc may each be active or inactive. when only the mc is active, it acts as the kay controlling the secured channel. the host can act as the kay when it is functional and after it acquires control over macsec. in this case, the mc uses the secured channel set by the host.
• management traffic not protected by macsec - the management traffic from and to the mc is carried over a separate mac address and/or a separate vlan, and the network switch is configured to allow such traffic to pass unprotected.

the mc controls which transmit packets should go through macsec and which should bypass it. it controls per-packet security through vendor specific messages that enable and disable macsec operation. two usage cases are supported:
• manageability traffic is not protected by macsec (see above paragraph).
• per-packet configuration by the mc - the mc may configure on a per-packet basis whether to apply macsec to a packet, since some packets (e.g. 802.1x control packets) must not go through macsec even if macsec is enabled. for example, if 802.1x packets should not be secured by macsec, the mc must disable macsec operation before sending the 802.1x packets and re-enable macsec operation afterwards. the 82576 must follow the proper ordering of such a sequence (i.e. the set of packets that do not go through macsec).
the 82576 provides the following functionality to allow management traffic to share the same secure channel with the host:
• handover of macsec ownership between the mc and the host. several transitions in ownership are possible:
   • power-on - the 82576 powers up with macsec not owned by the bmc. if the mc is configured for macsec, it takes ownership over macsec as described below. if the mc is not configured for macsec, the host takes ownership when it boots. if macsec is not owned by the bmc, the host is not required to perform any handshake with the mc, as there are cases where the mc is not connected to the lan controller. in case of a race between the mc and the host, the mc wins ownership of macsec, and the host is then interrupted so that the macsec resources are not accessible.
   • handover of macsec responsibility from mc to host - the host may initiate a transfer of ownership from the mc (e.g. on o/s boot).
   • handover of macsec responsibility from host to mc - the host may initiate a transfer of ownership to the mc (e.g. on entry to low power state). this is done through the host slave command interface.
   • forced handover of macsec responsibility from host to mc - the mc may acquire ownership of macsec on its own, for example when the host fails to acquire a secure channel.
   see section 10.8.1 for the different transition sequences.
• configuration of macsec resources by the mc - when the mc owns the secure channel, it configures macsec operation through the smb or nc-si vendor-specific commands. the messages are described in section 10.6.2.15 for nc-si and in section 10.5.10.2.7 and section 10.5.10.1.6 for smbus.
• alerts - the 82576 initiates an smb or nc-si alert to the mc on several macsec events. the exact format of the alerts is defined in section 10.6.2.16 and section 10.5.10.2.7.
   • packet arrived with a macsec error (no sa match, replay detection, or a bad macsec signature).
   • key-exchange event - relevant on tx when the packet number counter reaches the exhaustion threshold as described in section 7.9.1.5.1.
   • host request for macsec ownership.
   • host request to relinquish macsec ownership.
• interrupt causes - the 82576 issues a management interrupt to the host on the following macsec events:
   • acknowledge of handover of macsec responsibility from mc to host.
   • forced handover of macsec responsibility from host to bmc.

the host may identify the ownership status by reading the os status field in the lswfw register.

10.8.1 handover of macsec responsibility between mc and host

10.8.1.1 kay ownership release by the host

the following procedure is used by the host to release ownership of the macsec capability. this procedure is usually done before an ordered shutdown of the host.
• the host should stop accessing the macsec registers and set the lswfw.release request bit.
• setting this bit causes an interrupt to the fw that is forwarded to the bmc.
• the mc then takes ownership as described in section 10.8.1.2.
• the host may then wait for an interrupt from the fw indicating that the mc took the kay ownership.

10.8.1.2 kay ownership takeover by bmc

as mentioned above, the mc may acquire ownership over macsec either after the host relinquishes ownership or without any negotiation (e.g. on power-up and on a forced transition when the host failed to bring up a macsec connection). the mc acquires ownership of macsec by taking the following actions:
• locking host access to the macsec resources by setting the lswfw.lock macsec logic bit, thereby indicating its ownership over macsec.
• blocking any host transmit packets from going to the wire by setting the lswfw.block host traffic bit.
• set lswfw.os status to 1 to indicate a takeover of macsec.
• issue a manageability event interrupt to the host.

note: this specification does not specify how the mc determines that the host failed to bring up a macsec connection or that the connection was broken once established. it can be done by checking whether the mc manages to connect to a management console a reasonable time after the macsec ownership was handed to the host, and periodically afterwards. because a re-negotiation process may occur during the normal life of a macsec connection and prevent the mc from connecting to its console for a relatively extended period, the timeout before a forced ownership takeover by the mc should be relatively large.

10.8.1.3 kay ownership request by the host

the following procedure is used by the host to request ownership of the macsec capability:
• the host should read the lswfw.os status field to check if the kay is currently owned by the bmc.
• if the kay is owned by the bmc, the host should set the macsec request bit in the lswfw register prior to assuming responsibility over the macsec connection.
• setting this bit causes an interrupt to the fw that is forwarded to the bmc.
• the host should then wait for an interrupt from the fw indicating that the mc released the kay ownership.
• the host should then check the lswfw.os status and lswfw.lock macsec logic fields to make sure the mc released the kay ownership.
• if the mc decides to deny the release request, it silently ignores the request.

10.8.1.4 kay ownership release by bmc

to release ownership of macsec, the mc should take the following actions:
• disconnect the macsec connection with the switch (e.g. eap logoff).
• clear the lswfw.lock macsec logic bit to allow host ownership of the macsec registers.
• allow host traffic by clearing the lswfw.block host traffic bit.
• set lswfw.os status to 0 to indicate a release of macsec.
• issue a manageability event interrupt to the host.
• poll the connection state to check if the macsec channel was set by the host.

if the mc decides to deny the release request, it should keep lswfw.os status at 1 to indicate a denial of the request to release macsec.
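The takeover and release sequences of sections 10.8.1.2 and 10.8.1.4 touch the same three lswfw fields in opposite directions. The sketch below models only those fields as plain attributes. It is not a register map: bit positions are not given here, the class and function names are hypothetical, and the interrupt is represented by a returned string.

```python
from dataclasses import dataclass


@dataclass
class Lswfw:
    """Software model of the lswfw fields used for MACsec handover."""
    lock_macsec_logic: bool = False    # default: host access enabled
    block_host_traffic: bool = False   # default: host transmit enabled
    os_status: int = 0                 # 0 = owned by host, 1 = owned by bmc


def mc_take_ownership(reg: Lswfw) -> str:
    """MC takeover: lock the registers, block host tx, flag ownership."""
    reg.lock_macsec_logic = True
    reg.block_host_traffic = True
    reg.os_status = 1
    return "manageability event interrupt"


def mc_release_ownership(reg: Lswfw) -> str:
    """MC release: unlock the registers, allow host tx, clear ownership."""
    reg.lock_macsec_logic = False
    reg.block_host_traffic = False
    reg.os_status = 0
    return "manageability event interrupt"


def host_owns(reg: Lswfw) -> bool:
    """Host-side check made after the release interrupt arrives."""
    return reg.os_status == 0 and not reg.lock_macsec_logic
```

The eap logoff before release and the connection-state polling after it are outside this model.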
10.8.1.5 control registers

the following configuration fields are dedicated for manageability control over macsec.

table 10-51. management configuration fields for macsec

register.field: lswfw.lock macsec logic
description: serves two purposes. first, it indicates who owns macsec (default value is host ownership). second, it enables or disables host accesses to the macsec registers. default is to enable. the following registers are blocked:
• macsec tx capabilities register - lsectxcap.
• macsec rx capabilities register - lsecrxcap.
• macsec tx control register - lsectxctrl.
• macsec rx control register - lsecrxctrl.
• macsec tx sci low - lsectxscl.
• macsec tx sci high - lsectxsch.
• macsec tx sa - lsectxsa.
• macsec tx sa pn 0 - lsectxpn0.
• macsec tx sa pn 1 - lsectxpn1.
• macsec tx key 0 - lsectxkey0 (four registers).
• macsec tx key 1 - lsectxkey1 (four registers).
• macsec rx sci low - lsecrxscl.
• macsec rx sci high - lsecrxsch.
• macsec rx sa - lsecrxsa (0 and 1).
• macsec rx sa pn - lsecrxsapn (0 and 1).
• macsec rx key - lsecrxkey (four registers per sa).
• tx untagged packet counter - lsectxut.
• encrypted tx packets - lsectxpkte.
• protected tx packets - lsectxpktp.
• encrypted tx octets - lsectxocte.
• protected tx octets - lsectxoctp.
• macsec untagged non-strict rx packet - lsecrxutns.
• macsec untagged strict rx packet - lsecrxutys.
• macsec rx octets decrypted - lsecrxocte.
• macsec rx octets validated - lsecrxoctp.
• macsec rx packet with bad tag - lsecrxbad.
• macsec non-strict rx packet unknown sci - lsecrxnoscins.
• macsec strict rx packet unknown sci - lsecrxnosciys.
• macsec rx unchecked packets - lsecrxnosci.
• macsec rx delayed packets - lsecrxdelay.
• macsec rx late packets - lsecrxlate.
• macsec rx packet ok - lsecrxok.
• macsec check rx invalid - lsecrxinvck.
• macsec strict rx invalid - lsecrxinvst.
• macsec strict rx no sa - lsecrxnsast.
• macsec non strict rx no sa - lsecrxnsa.
10.8.2 filtering of non-macsec packets

an mc receiving packets from the 82576 cannot distinguish between a packet received without macsec protection and a packet for which the macsec envelope was removed by the 82576. reception of packets without macsec protection may be considered illegal unless they are part of the communication to the radius server or part of the kay process. the 82576 supports filtering of these illegal packets using the following procedure:
1. program metf[2] to filter using an ethertype of 0x88e5 (kay packets).
2. program metf[3] to filter using an ethertype of 0x888e (eapol).
3. set the macsec filtering bit in manc[27]. this filters out any packet that does not match at least one of the following conditions:
   a. the packet is a macsec packet authenticated and/or decrypted adequately by the hw.
   b. the packet ethertype matches metf[2].
   c. the packet ethertype matches metf[3].

10.8.3 sending of clear packets in a macsec environment

as part of the kay key exchange process, the mc needs to send clear eapol packets. to do that, the following flow should be used:
1. stop macsec encryption using the "disable network tx encryption" command.
2. in nc-si mode, wait for the response to the command.
3. send the clear packets.
4. restart macsec encryption using the "enable network tx encryption" command.
5. in nc-si mode, wait for the response to the command.
6. continue to send regular encrypted packets.

table 10-51. management configuration fields for macsec (continued)
register.field | description
lswfw.block host traffic | enables or disables host transmit traffic for this pci function from going to the wire. default is to enable.
lswfw.os status | set by the fw to indicate the status of the macsec ownership: 0 = macsec owned by host (default). 1 = macsec owned by bmc.
lswfw.macsec request | bit used by host to request kay ownership.
lswfw.macsec release | bit used by host to release kay ownership.
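The filtering rule of section 10.8.2 reduces to a single predicate: a packet reaches the MC if the hardware validated it as MACsec, or if its ethertype matches one of the two programmed filters. The sketch below only models that decision; the function name and the `macsec_ok` flag are hypothetical, and the filter values are the ones the procedure programs into metf[2] and metf[3].

```python
METF2_ETHERTYPE = 0x88E5   # programmed into metf[2] in step 1
METF3_ETHERTYPE = 0x888E   # programmed into metf[3] in step 2 (eapol)


def passes_macsec_filter(ethertype: int, macsec_ok: bool) -> bool:
    """Model of the receive decision once the MACsec filtering bit is set.

    `macsec_ok` means the hardware authenticated and/or decrypted the
    packet as a MACsec packet.
    """
    return macsec_ok or ethertype in (METF2_ETHERTYPE, METF3_ETHERTYPE)
```

An unprotected IPv4 frame (ethertype 0x0800) is dropped under this rule, while an unprotected eapol frame still reaches the MC.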
11.0 electrical / mechanical specification

11.1 introduction

this chapter describes the 82576 dc and ac (timing) electrical characteristics. this includes absolute maximum ratings, recommended operating conditions, power sequencing requirements, and dc and ac timing specifications. the dc and ac characteristics include the generic digital 3.3v i/o specification as well as other specifications supported by the 82576. for thermal information, see chapter 13.0.
11.2 operating conditions

table 11-1. absolute maximum ratings (note 1)
symbol | parameter | min | max | units
t case (note 2) | case temperature under bias | see note 2 | see note 2 | °c
t storage | storage temperature range | -65 | 140 | °c
vi/vo | 3.3v compatible i/os voltage | vss - 0.5 | 4.6 | v
vi/vo | analog 1.0 i/o voltage | vss - 0.2 | 1.68 | v
vi/vo | analog 1.8 i/o voltage | vss - 0.3 | 2.52 | v
vcc3p3 | 3.3v periphery dc supply voltage | vss - 0.5 | 4.6 | v
vcc | 1.0v core dc supply voltage | vss - 0.2 | 1.68 | v
vcc1p8 | 1.8v analog dc supply voltage | vss - 0.3 | 2.52 | v
vcc1p0 | 1.0v analog dc supply voltage | vss - 0.2 | 1.68 | v
notes:
1. ratings in this table are those beyond which permanent device damage is likely to occur. these values should not be used as the limits for normal device operation. exposure to absolute maximum rating conditions for extended periods may affect device reliability.
2. detailed t case information is in chapter 13.0, thermal design specifications.

11.2.1 recommended operating conditions

table 11-2. recommended operating conditions
symbol | parameter | min | max | units | notes
ta | operating temperature range, commercial (ambient; 0 cfs airflow) | 0 | 55 | °c | 1, 2, 3
notes:
1. for normal device operation, adhere to the limits in this table. sustained operation of a device at conditions exceeding these values, even if they are within the absolute maximum rating limits, may result in permanent device damage or impaired device reliability. device functionality to stated dc and ac limits is not guaranteed if conditions exceed recommended operating conditions.
2. recommended operating conditions require accuracy of power supply of +/-5% relative to the nominal voltage.
3. this temperature range may require thermal management. see chapter 13.0, thermal design specifications.

11.3 power delivery

11.3.1 power supply specification

table 11-3. external power supply specification

vcc3p3 (3.3v) parameters
title | description | min | max | units
rise time | time from 10% to 90% mark | 0.1 | 100 | ms
monotonicity | voltage dip allowed in ramp | n/a | 0 | mv
slope | ramp rate at any given time between 10% and 90%. min: 0.8*v(min)/rise time (max). max: 0.8*v(max)/rise time (min). | 24 | 28800 | v/s
operational range | voltage range for normal operating conditions | 3 | 3.6 | v
ripple | maximum voltage ripple (peak to peak) (note 1) | n/a | 70 | mv
overshoot | maximum overshoot allowed | n/a | 100 | mv
overshoot settling time | maximum overshoot allowed duration (at that time delta voltage should be lower than 5 mv from steady state voltage) | n/a | 0.05 | ms
note 1: the ripple measurement should be performed at 20 mhz bw.

vcc1p8 (1.8v) parameters
title | description | min | max | units
rise time | time from 10% to 90% mark | 0.1 | 100 | ms
monotonicity | voltage dip allowed in ramp | n/a | 0 | mv
slope | ramp rate at any given time between 10% and 90%. min: 0.8*v(min)/rise time (max). max: 0.8*v(max)/rise time (min). | 14 | 60000 | v/s
operational range | voltage range for normal operating conditions | 1.71 | 1.89 | v
ripple | maximum voltage ripple (peak to peak) (note 1) | n/a | 40 | mv
overshoot | maximum overshoot allowed | n/a | 100 | mv
overshoot settling time | maximum overshoot allowed duration (at that time delta voltage should be lower than 5 mv from steady state voltage) | n/a | 0.1 | ms
decoupling capacitance | capacitance range | 15 | 25 | µf
capacitance esr | equivalent series resistance of output capacitance | n/a | 50 | mΩ
note 1: the ripple measurement should be performed at 20 mhz bw.

vcc1p0 (1.0v) parameters
title | description | min | max | units
rise time | time from 10% to 90% mark | 0.1 | 100 | ms
monotonicity | voltage dip allowed in ramp | n/a | 0 | mv
slope | ramp rate at any given time between 10% and 90%. min: 0.8*v(min)/rise time (max). max: 0.8*v(max)/rise time (min). | 7.6 | 33600 | v/s
operational range | voltage range for normal operating conditions | 0.95 | 1.08 | v
ripple | maximum voltage ripple (peak to peak) (note 1) | n/a | 40 | mv
overshoot | maximum overshoot allowed | n/a | 100 | mv
overshoot duration | maximum overshoot allowed duration (at that time delta voltage should be lower than 5 mv from steady state voltage) | 0.0 | 0.05 | ms
decoupling capacitance | capacitance range | 15 | 25 | µf
capacitance esr | equivalent series resistance of output capacitance | 5 | 50 | mΩ
note 1: the ripple measurement should be performed at 20 mhz bw.

11.3.1.1 power on/off sequence

the following relationships between the rise times of the different power supplies should be maintained at all times when external power supplies are in use, to avoid risk of either latch-up or forward-biased internal diodes:

t3.3 v supply ≤ t1.8 v supply ≤ t1.0 v supply

on power-on, after 3.3v reaches 10% of its final value, all voltage rails (1.8v and 1.0v) are allowed 100 ms to reach their final operating values. however, to keep leakage current at a minimum, it is recommended to turn on the power supplies almost simultaneously (with a delay between supplies of at most a few milliseconds).

for power-down, it is recommended to turn off all rails at the same time and leave the voltage to decay.

table 11-4. power sequencing for the 82576
symbol | parameter | min | max | units
t 3_18 | vcc3p3 (3.3v) stable to vcc1p8 stable | 0 | 100 | ms
t 18_1 | vcc1p8 stable to vcc (1.0v) stable | 0 | | ms
t 3_1 | vcc3p3 (3.3v) stable to vcc (1.0v) stable | 0 | 100 | ms
tm-per | 3.3v core to perst# de-assertion | 100 | | ms
tm-ppo | 3.3v core to main_pwr_ok on | 0 | | ms
Figure 11-1. Power and reset sequencing

11.4 DC/AC Specification

11.4.1 Ball Summary

See Chapter 2.0 for ball descriptions and the ball-out map.

11.4.2 DC Specifications

11.4.2.1 Current Consumption

All the numbers in this section are based on A1 class A measurements.
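The total power column of Table 11-5 appears to be the sum of each rail current multiplied by its nominal rail voltage, at least for the typical rows. The sketch below reproduces that arithmetic; the function name is hypothetical, and nominal voltages are assumed because note 1 of the table specifies nominal voltages for typical conditions. Maximum rows are measured at maximum voltage and temperature and are not expected to match this simple sum.

```python
# Illustrative only: total_power_mw is a hypothetical helper name.
RAIL_VOLTS = (3.3, 1.8, 1.0)  # nominal rail voltages in volts


def total_power_mw(i_3v3_ma, i_1v8_ma, i_1v0_ma):
    """Sum of V*I over the three rails; mA times V gives mW."""
    currents = (i_3v3_ma, i_1v8_ma, i_1v0_ma)
    return sum(v * i for v, i in zip(RAIL_VOLTS, currents))


# D0a active link, manageability active, typical rows of Table 11-5.
print(round(total_power_mw(15, 330, 300), 1))  # 10 Mb/s row
print(round(total_power_mw(15, 270, 376), 1))  # 100 Mb/s row
print(round(total_power_mw(15, 722, 740), 1))  # 1000 Mb/s copper row
```

The three results are 943.5, 911.5 and 2089.1 mW, matching the table entries for those rows.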
Table 11-5. Current consumption details

Condition (note 1) | Speed (Mbps) | Typ/Max | 3.3V (mA) | 1.8V (mA) | 1.0V (mA) | Total power (mW)
--- | --- | --- | --- | --- | --- | ---
D0a - active link (manageability active) | 10 | Typ | 15 | 330 | 300 | 943.5
 | 100 | Typ | 15 | 270 | 376 | 911.5
 | 1000 copper | Typ | 15 | 722 | 740 | 2089.1
 | 1000 copper | Max | 15 | 744 | 1220 | 2810 (note 2)
 | 1000 SerDes (note 3) | Typ | 15 | 200 | 440 | 849.5
 | 1000 SerDes (note 3) | Max | 15 | 240 | 867 | 1348.5 (note 2)
D0a - active link (manageability off) | 10 | Typ | 15 | 328 | 279 | 918.9
 | 100 | Typ | 15 | 270 | 355 | 890.5
 | 1000 copper | Typ | 15 | 722 | 711 | 2060.1
 | 1000 copper | Max | 15 | 737 | 1210 | 2753 (note 2)
 | 1000 SerDes (note 3) | Typ | 15 | 200 | 412 | 821.5
 | 1000 SerDes (note 3) | Max | 15 | 240 | 842 | 1323.5 (note 2)
D0a - idle link (manageability active) - L states disabled | No link | Typ | 15 | 108 | 352 | 595.9
 | 10 | Typ | 15 | 119 | 296 | 559.7
 | 100 | Typ | 15 | 270 | 371 | 906.5
 | 1000 copper | Typ | 15 | 722 | 692 | 2041.1
 | 1000 SerDes (note 3) | Typ | 15 | 149 | 412 | 729.7
D0a - idle link (manageability active) - L0s only | No link | Typ | 15 | 87 | 351 | 557.1
 | 10 | Typ | 15 | 98 | 293 | 518.9
 | 100 | Typ | 15 | 248 | 369 | 864.9
 | 1000 copper | Typ | 15 | 701 | 692 | 2003.3
 | 1000 SerDes (note 3) | Typ | 15 | 128 | 408 | 687.9
Table 11-5. Current consumption details (continued)

Condition (note 1) | Speed (Mbps) | Typ/Max | 3.3V (mA) | 1.8V (mA) | 1.0V (mA) | Total power (mW)
--- | --- | --- | --- | --- | --- | ---
D0a - idle link (manageability active) - L0s & L1 | No link | Typ | 15 | 64 | 288 | 452.7
 | 10 | Typ | 15 | 75 | 231 | 415.5
 | 100 | Typ | 15 | 224 | 308 | 760.7
 | 1000 copper | Typ | 15 | 677 | 623 | 1891.1
 | 1000 SerDes (note 3) | Typ | 15 | 105 | 349 | 587.5
D0a - idle link (manageability off) - L states disabled | No link | Typ | 15 | 108 | 265 | 508.9
 | 10 | Typ | 15 | 119 | 270 | 533.7
 | 100 | Typ | 15 | 268 | 340 | 871.9
 | 1000 copper | Typ | 15 | 722 | 663 | 2012.1
 | 1000 SerDes | Typ | 15 | 149 | 378 | 695.7
D0a - idle link (manageability off) - L0s only | No link | Typ | 15 | 108 | 265 | 508.9
 | 10 | Typ | 15 | 119 | 270 | 533.7
 | 100 | Typ | 15 | 268 | 340 | 871.9
 | 1000 copper | Typ | 15 | 700 | 657 | 1966.5
 | 1000 SerDes (note 3) | Typ | 15 | 128 | 375 | 654.9
D0a - idle link (manageability off) - L0s & L1 | No link | Typ | 15 | 65 | 202 | 368.5
 | 10 | Typ | 15 | 75 | 206 | 390.5
 | 100 | Typ | 15 | 226 | 278 | 734.3
 | 1000 copper | Typ | 15 | 678 | 600 | 1869.9
 | 1000 SerDes (note 3) | Typ | 15 | 105 | 316 | 554.5
D3cold - wake-up enabled (manageability active) | No link | Typ | 15 | 64 | 165 | 329.7
 | 10 | Typ | 15 | 75 | 175 | 359.5
 | 100 | Typ | 15 | 224 | 243 | 695.7
 | 1000 SerDes (note 3) | Typ | 15 | 104 | 279 | 515.7
Table 11-5. Current consumption details (continued)

Condition (note 1) | Speed (Mbps) | Typ/Max | 3.3V (mA) | 1.8V (mA) | 1.0V (mA) | Total power (mW)
--- | --- | --- | --- | --- | --- | ---
D3cold - wake-up enabled (manageability off) | No link | Typ | 15 | 64 | 87 | 251.7
 | 10 | Typ | 15 | 75 | 89 | 273.5
 | 100 | Typ | 15 | 224 | 153 | 605.7
 | 1000 SerDes (note 3) | Typ | 15 | 104 | 145 | 381.7
D3cold - wake disabled (manageability off) | No link | Typ | 15 | 68 | 88 | 259.9
D(r) uninitialized, disabled through DEV_OFF_N | No link | Typ | 15 | 33 | 124 | 232.9

Notes:
1. Typical conditions: room temperature (Ta) = 25 °C, nominal voltages and continuous network traffic at link speed at full duplex.
2. Maximum conditions: maximum operating temperature (Tj) values, maximum voltage values, continuous network traffic at link speed at full duplex.
3. To estimate power for SGMII mode, use the SerDes mode power numbers provided.

11.4.2.2 Digital I/O

Table 11-6. Digital I/O DC electrical characteristics (note 1)

Symbol | Parameter | Conditions | Min | Max | Units | Note
--- | --- | --- | --- | --- | --- | ---
VCC3P3 | Periphery supply | | 3.0 | 3.6 | V |
VCC | Core supply | | 0.95 | 1.08 | V |
VOH | Output high voltage | IOH = -16 mA; VCC3P3 = min | 2.4 | | V |
 | | IOH = -100 µA; VCC3P3 = min | VCC3P3 - 0.2 | | |
VOL | Output low voltage | IOL = 13 mA; VCC = min | | 0.43 | V |
 | | IOL = 100 µA; VCC = min | | 0.2 | V |
 | | Iout = 13 mA | | 0.4 | V |
VIH | Input high voltage | | 2.0 | VCC3P3 + 0.3 | V | 2
VIL | Input low voltage | | -0.3 | 0.8 | V | 2
IIL | Input current | VCC3P3 = max; VI = 3.6V/GND | | 24.5 | µA |
IOFF | Current at IDDQ mode | | | 50 | µA | 3
PU | Internal pull-up | | 2.8 | 7.4 | kΩ | 4, 5, 6, 7
Table 11-6. Digital I/O DC electrical characteristics (note 1) (continued)

Symbol | Parameter | Conditions | Min | Max | Units | Note
--- | --- | --- | --- | --- | --- | ---
Ipup | Internal pull-up current | 0 to 0.5*VCC3P3 [V] | 0.43 | 1.1 | mA | 8, 9
 | Built-in hysteresis | | 100 | 400 | mV |
VOS | Overshoot | | N/A | 4 | V |
VUS | Undershoot | | N/A | -0.4 | V |
Cin | Input pin capacitance | Max input capacitance | | 5 | pF | 10
Cout | Output pin capacitance | Max output load capacitance per 160 MHz | | 16 | pF | 10

Notes:
1. Entire table applies to PE_RSTn, LED0[3:0], LED1[3:0], SFP0_I2C_CLK, SFP1_I2C_CLK, SRDS0_SIG_DET, SRDS1_SIG_DET, DEV_OFF_N, M_PWR_OK, JTCK, JTDI, JTDO, JTMS, SDP0[3:0], SDP1[3:2], SDP1[0], FLSH_SI, FLSH_SO, FLSH_SCK, FLSH_CE_N, EE_DI, EE_DO, EE_SK, EE_CS_N.
2. The input buffer also has hysteresis > 160 mV.
3. IDDQ mode maximum current consumption: CORE_VDD: 15 mA; VCC3P3: 35 mA.
4. Internal pull-up max was characterized at slow corner (110 °C, VCC3P3 = min, process slow); internal pull-up min was characterized at fast corner (0 °C, VCC3P3 = max, process fast).
5. External R pull-down recommended is 400 Ω.
6. External R pull-up recommended is 3 kΩ.
7. External buffer recommended strength: 2 mA.
8. Internal pull-up max current consumption was characterized at fast corner (0 °C, VCC3P3 = max, process fast).
9. Internal pull-up min current consumption was characterized at slow corner (115 °C, VCC3P3 = min, process slow).
10. Characterized, not tested.

11.4.2.3 Open Drain I/Os

Table 11-7. Open drain DC specifications (note 1, 4)

Symbol | Parameter | Condition | Min | Max | Units | Note
--- | --- | --- | --- | --- | --- | ---
VCC3P3 | Periphery supply | | 3.0 | 3.6 | V |
VCC | Core supply | | 0.9 | 1.32 | V |
VIH | Input high voltage | | 2.1 | | V |
VIL | Input low voltage | | | 0.8 | V |
Ileakage | Output leakage current | 0 < Vin < VCC3P3 | | +/-17 | µA | 2
VOL | Output low voltage | @ Ipullup | | 0.4 | V | 4
Ipullup | Current sinking | VOL = 0.4V | 4 | | mA |
Cin | Input pin capacitance | | | 7 | pF | 3
Table 11-7. Open drain DC specifications (note 1, 4) (continued)

Symbol | Parameter | Condition | Min | Max | Units | Note
--- | --- | --- | --- | --- | --- | ---
Cout | Output pin capacitance | f = 5 MHz | | 16 | pF | 3
Ioffsmb | Input leakage current | VCC3P3 off or floating | | +/-10 | µA | 2

Notes:
1. Applies to SMBD0, SMBCLK0, SMBALRT_N, PE_WAKE_N pads.
2. Device meets this whether powered or not.
3. Characterized, not tested.
4. OD, no high output drive. VOL max = 0.4V at 16 mA, VOL max = 0.2V at 0.1 mA.

11.4.2.4 NC-SI Input and Output Pads

Table 11-8. NC-SI pads DC specifications

Symbol | Parameter | Conditions | Min | Max | Units
--- | --- | --- | --- | --- | ---
VCC3P3 | Periphery supply | | 3.0 | 3.6 | V
VCC | Core supply | | 0.9 | 1.32 | V
VOH | Output high voltage | IOH = -4 mA; VCC3P3 = min | 2.4 | | V
VOL | Output low voltage | IOL = 4 mA; VCC3P3 = min | | 0.4 | V
VIH | Input high voltage | | 2.0 | | V
VIL | Input low voltage | | | 0.8 | V
Vihyst | Input hysteresis | | 100 | | mV
IIL/IIH | Input current | VCC3P3 = max; Vin = 3.6V/GND | | 15 | µA
IOLL/IOHL | Output current | VCC = VOL/VOH. Low driving strength | 4 | | mA
IOLH/IOHH | Output current | VCC = VOL/VOH. High driving strength | 8 | | mA
Cin | Input capacitance | | | 5 | pF
Ipup | Pull-up current | Vout = 0V (GND) | 0.4 | 1.3 | mA
Ioff | Current at IDDQ mode | VCC3P3 (periphery) | | 80 | µA

Note:
1. Applies to the NC-SI_ARB_OUT, NC-SI_CLK_OUT, NC-SI_CRS_DV, NC-SI_RXD[1:0], SDP1[1] input/output pads and the NC-SI_TX_EN, NC-SI_TXD[1:0], NC-SI_CLK_IN, NC-SI_ARB_IN input pads.

11.4.3 Digital I/F AC Specifications

11.4.3.1 Digital I/O AC Specifications

Table 11-9. Digital I/O AC electrical and timing characteristics

Parameter | Description | Min | Max | Cload | Note
--- | --- | --- | --- | --- | ---
Tor | Output time rise | 0.2 ns | 1 ns | 16 pF |
Tof | Output time fall | 0.2 ns | 1 ns | |
Table 11-9. Digital I/O AC electrical and timing characteristics (continued)

Parameter | Description | Min | Max | Cload | Note
--- | --- | --- | --- | --- | ---
Todr | Output delay rise | 0.8 ns | 3 ns | |
Todf | Output delay fall | 0.8 ns | 3 ns | |
Tidr | Input delay rise | 0.3 ns | 1.5 ns | 200 fF | 1
Tidf | Input delay fall | 0.3 ns | 1.5 ns | | 1
Tir | Input time rise | 0.03 ns | 0.1 ns | | 1
Tif | Input time fall | 0.03 ns | 0.1 ns | | 1

Note:
1. The input delay test conditions: maximum input level = Vin = 2.7V; input rise/fall time (0.2Vin to 0.8Vin) = 1 ns (slew rate ~ 1.5 ns).

Figure 11-2. Digital I/O output timing diagram
Figure 11-3. Digital I/O input timing diagram

11.4.3.2 Reset Signals

The timing between the power-up sequence and the different reset signals is described in Figure 4-2 and in Table 4-1.

11.4.3.2.1 Internal_Power_On_Reset

The 82576 uses an internal power-on detection circuit to generate the Internal_Power_On_Reset signal. Reset can also be implemented when an external power-on detection circuit determines that the device is powered up and asserts the Internal_Power_On_Reset signal to reset the device.
11.4.3.3 SMBus

The following table indicates the timing guaranteed when the driver or the agent is performing the action. Where only a typical value is specified, the actual value will be within 2% of the value indicated.

Table 11-10. SMBus timing parameters (master mode)

Symbol | Parameter | Min | Typ | Max | Units
--- | --- | --- | --- | --- | ---
fSMB | SMBus frequency | 84 | | 100 | kHz
tBUF | Time between STOP and START condition driven by the 82576 | | 6.56 | | µs
tHD:STA | Hold time after START condition. After this period, the first clock is generated. | | 6.72 | | µs
tSU:STA | START condition setup time | | | | µs
tSU:STO | STOP condition setup time | | 6.88 | | µs
tHD:DAT | Data hold time | | 0.48 | | µs
tTIMEOUT | Detect SMBCLK low timeout | 26.2 | | 31.5 | ms
tLOW | SMBCLK low time | | 5.76 | | µs
tHIGH | SMBCLK high time | | 6.56 | | µs

The following table indicates the timing requirements of the 82576 when it is the receiver of the indicated signal. Many of these are below the minimums specified by the SMBus specification.

Table 11-11. SMBus timing parameters (slave mode)

Symbol | Parameter | Min | Typ | Max | Units
--- | --- | --- | --- | --- | ---
fSMB | SMBus frequency | 10 | | 400 | kHz
tBUF | Time between STOP and START condition driven by the 82576 | 1.44 | | | µs
tHD:STA | Hold time after START condition. After this period, the first clock is generated. | 0.48 | | | µs
tSU:STA | START condition setup time | 1.6 | | | µs
tSU:STO | STOP condition setup time | 1.76 | | | µs
tHD:DAT | Data hold time | 0.32 | | | µs
tLOW | SMBCLK low time | 0.8 | | | µs
tHIGH | SMBCLK high time | 1.44 | | | µs
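A bus master that talks to the 82576 in slave mode has to respect the slave-mode values in Table 11-11. The sketch below compares a set of measured timings against them. It assumes that the single values in that table are minimums and that the units are microseconds (the unit prefix is not legible in every copy of the table); the dictionary keys and the function name are hypothetical.

```python
# Illustrative only: key names and the violations() helper are hypothetical.
# Values are read from Table 11-11 and assumed to be minimums in microseconds.
SLAVE_MIN_US = {
    "t_buf": 1.44,
    "t_hd_sta": 0.48,
    "t_su_sta": 1.6,
    "t_su_sto": 1.76,
    "t_hd_dat": 0.32,
    "t_low": 0.8,
    "t_high": 1.44,
}


def violations(measured_us):
    """Return the sorted names of measured timings that fall below the minimum.

    Parameters missing from measured_us are not checked.
    """
    return sorted(
        name
        for name, limit in SLAVE_MIN_US.items()
        if name in measured_us and measured_us[name] < limit
    )


# Hypothetical scope measurements from a bus master, in microseconds.
measured = {
    "t_buf": 4.7, "t_hd_sta": 4.0, "t_su_sta": 4.7, "t_su_sto": 4.0,
    "t_hd_dat": 0.3, "t_low": 4.7, "t_high": 4.0,
}
print(violations(measured))
```

In this made-up measurement set only the data hold time (0.3 µs against 0.32 µs) is flagged.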
Figure 11-4. SMBus I/F timing diagram

11.4.3.4 Flash AC Specification

The 82576 is designed to support a serial flash. The specifications are applicable over the recommended operating range from Ta = -40 °C to +85 °C, VCC3P3 = 3.3V, Cload = 1 TTL gate and 16 pF (unless otherwise noted). For the flash I/F timing specification, see Table 11-12 and Figure 11-5.

Table 11-12. Flash I/F timing parameters

Symbol | Parameter | Min | Typ | Max | Units | Note
--- | --- | --- | --- | --- | --- | ---
tSCK | SCK clock frequency | 0 | 15.625 | 20 | MHz | 1
tRI | Input rise time | | 2.5 | 20 | ns |
tFI | Input fall time | | 2.5 | 20 | ns |
tWH | SCK high time | 20 | 32 | | ns | 2
tWL | SCK low time | 20 | 32 | | ns | 2
tCS | CS high time | 25 | | | ns |
tCSS | CS setup time | 25 | | | ns |
tCSH | CS hold time | 25 | | | ns |
tSU | Data-in setup time | 5 | | | ns |
tH | Data-in hold time | 5 | | | ns |
tV | Output valid | | | 20 | ns |
tHO | Output hold time | 0 | | | ns |
tDIS | Output disable time | | | 100 | ns |

Notes:
1. Clock is 62.5 MHz divided by 4.
2. 50% duty cycle.
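The typical SCK figures in Table 11-12 follow from simple clock arithmetic. The sketch below assumes that note 1 describes a 62.5 MHz source divided by 4, which is consistent with the 15.625 MHz typical frequency, and that the high time is half a period per the 50% duty cycle note. The variable names are illustrative.

```python
# Illustrative arithmetic only; the divide-by-4 reading of note 1 is an assumption.
BASE_CLOCK_MHZ = 62.5
DIVIDER = 4
DUTY_CYCLE = 0.5

sck_mhz = BASE_CLOCK_MHZ / DIVIDER   # typical SCK frequency in MHz
period_ns = 1000.0 / sck_mhz         # one SCK period in ns
high_ns = period_ns * DUTY_CYCLE     # SCK high time at 50% duty cycle

print(sck_mhz, period_ns, high_ns)
```

This gives 15.625 MHz, a 64 ns period and a 32 ns high time, matching the 15.625 MHz and 32 ns entries in the table.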
Figure 11-5. Flash timing diagram

11.4.3.5 EEPROM AC Specification

The 82576 is designed to support a standard serial EEPROM. The specifications are applicable over the recommended operating range from Ta = -40 °C to +85 °C, VCC3P3 = 3.3V, Cload = 1 TTL gate and 16 pF (unless otherwise noted). For the EEPROM I/F timing specification, see Table 11-13 and Figure 11-6.

Table 11-13. EEPROM I/F timing parameters

Symbol | Parameter | Min | Max | Units | Note
--- | --- | --- | --- | --- | ---
tSCK | SCK clock frequency | 0 | 2.1 | MHz | 1
tRI | Input rise time | | 2 | µs |
tFI | Input fall time | | 2 | µs |
tWH | SCK high time | 200 | | ns | 2
tWL | SCK low time | 200 | | ns |
tCS | CS high time | 250 | | ns |
tCSS | CS setup time | 250 | | ns |
tCSH | CS hold time | 250 | | ns |
tSU | Data-in setup time | 50 | | ns |
tH | Data-in hold time | 50 | | ns |
tV | Output valid | 0 | 200 | ns |
Table 11-13. EEPROM I/F timing parameters (continued)

Symbol | Parameter | Min | Max | Units | Note
--- | --- | --- | --- | --- | ---
tHO | Output hold time | 0 | | ns |
tDIS | Output disable time | | 250 | ns |

Notes:
1. Clock is 2 MHz.
2. 50% duty cycle.

Figure 11-6. EEPROM timing diagram

11.4.3.6 NC-SI AC Specification

The 82576 is designed to support the standard DMTF NC-SI interface. For the NC-SI I/F timing specification, see Table 11-14 and Figure 11-7.

Table 11-14. NC-SI AC specifications

Symbol | Parameter | Min | Typ | Max | Units | Notes
--- | --- | --- | --- | --- | --- | ---
Tckf | NCSI_CLK_IN frequency | | 50 | | MHz | 2
Rdc | NCSI_CLK_IN duty cycle | 35 | | 65 | % | 1
Racc | NCSI_CLK_IN accuracy | | | 100 | ppm |
Tco | Clock-to-out (10 pF <= Cload <= 50 pF): NCSI_RXD[1:0], NCSI_CRS_DV data valid from NCSI_CLK_IN rising edge | 2.5 | | 12.5 | ns | 4
Tsu | NCSI_TXD[1:0], NCSI_TX_EN data setup to NCSI_CLK_IN rising edge | 3 | | | ns |
Thold | NCSI_TXD[1:0], NCSI_TX_EN data hold from NCSI_CLK_IN rising edge | 1 | | | ns |
Table 11-14. NC-SI AC specifications (continued)

Symbol | Parameter | Min | Typ | Max | Units | Notes
--- | --- | --- | --- | --- | --- | ---
Tor | NCSI_RXD[1:0], NCSI_CRS_DV output time rise | 0.5 | | 6 | ns | 3
Tof | NCSI_RXD[1:0], NCSI_CRS_DV output time fall | 0.5 | | 6 | ns | 3
Tckr/Tckf | NCSI_CLK_IN rise/fall time | 0.5 | | 3.5 | ns |
Tckor/Tckof | NCSI_CLK_OUT rise/fall time | 0.5 | | 3.5 | ns | 5

Notes:
1. Clock duty cycle measurement: high interval measured from VIH to VIL points, low from VIL to next VIH.
2. Clock interval measurement from VIH to VIH.
3. Cload = 25 pF.
4. This timing relates to the output pins, while Tsu and Thd relate to timing at the input pins.
5. 10 pF <= Cload <= 30 pF.

Figure 11-7. NC-SI timing diagram

11.4.3.7 JTAG AC Specification

The 82576 is designed to support the IEEE 1149.1 standard. The following timing specifications are applicable over the recommended operating range from Ta = 0 °C to +70 °C, VCC3P3 = 3.3V, Cload = 16 pF (unless otherwise noted). For the JTAG I/F timing specification, see Table 11-15 and Figure 11-8.

Table 11-15. JTAG I/F timing parameters

Symbol | Parameter | Min | Typ | Max | Units | Note
--- | --- | --- | --- | --- | --- | ---
tJCLK | JTCK clock frequency | | | 10 | MHz |
tJH | JTMS and JTDI hold time | 10 | | | ns |
tJSU | JTMS and JTDI setup time | 10 | | | ns |
tJPR | JTDO propagation delay | | | 15 | ns |

Notes: The table applies to JTCK, JTMS, JTDI and JTDO. Timing is measured relative to a JTCK reference voltage of VCC3P3/2.
Figure 11-8. JTAG AC timing diagram

11.4.3.8 MDIO AC Specification

The 82576 is designed to support the MDIO specifications defined in IEEE 802.3 clause 22. The following timing specifications are applicable over the recommended operating range from Ta = 0 °C to +70 °C, VCC3P3 = 3.3V, Cload = 16 pF (unless otherwise noted). For the MDIO I/F timing specification, see Table 11-16, Figure 11-9 and Figure 11-10.

Table 11-16. MDIO I/F timing parameters

Symbol | Parameter | Min | Typ | Max | Units | Note
--- | --- | --- | --- | --- | --- | ---
tMCLK | MDC clock frequency | | | 2.5 | MHz |
tMH | MDIO hold time | 10 | | | ns |
tMSU | MDIO setup time | 10 | | | ns |
tMPR | MDIO propagation delay | 10 | | 300 | ns |

Notes: The table above applies to MDIO0, MDC0, MDIO1, MDC1, MDIO2, MDC2, MDIO3, and MDC3. Timing is measured relative to an MDC reference voltage of 2.0V (VIH).
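For readers who drive the MDIO pins directly, the sketch below builds the 32 bits that follow the preamble of an IEEE 802.3 clause 22 write frame (start 01, write opcode 01, 5-bit PHY address, 5-bit register address, turnaround 10, 16 data bits) and estimates the duration of a full 64-bit frame at the 2.5 MHz MDC rate from Table 11-16. It is a generic clause 22 illustration, not code from the 82576 driver; the function names and the example PHY address, register and data value are arbitrary.

```python
# Illustrative only: function names and example values are hypothetical.

def clause22_write_frame(phy_addr, reg_addr, data):
    """Return the 32 bits sent after the 32-bit preamble of a clause 22 write."""
    if not 0 <= phy_addr < 32 or not 0 <= reg_addr < 32:
        raise ValueError("PHY and register addresses are 5-bit fields")
    if not 0 <= data <= 0xFFFF:
        raise ValueError("data is a 16-bit field")
    start = 0b01        # ST: start of frame
    opcode = 0b01       # OP: write (a read uses 10)
    turnaround = 0b10   # TA: driven as 10 by the station on a write
    return (
        (start << 30) | (opcode << 28) | (phy_addr << 23)
        | (reg_addr << 18) | (turnaround << 16) | data
    )


def frame_time_us(mdc_mhz=2.5, bits=64):
    """Duration of preamble plus frame, one bit per MDC cycle."""
    return bits / mdc_mhz


print(hex(clause22_write_frame(phy_addr=1, reg_addr=0, data=0x1200)))
print(frame_time_us())
```

At 2.5 MHz one MDC cycle is 400 ns, so a 64-bit transaction takes 25.6 µs before any inter-frame idle time.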
Figure 11-9. MDIO input AC timing diagram

Figure 11-10. MDIO output AC timing diagram

11.4.3.9 SFP 2 Wires I/F AC Specification

According to Atmel's AT24C01A/02/04 definition of the 2-wire I/F bus.

11.4.3.10 PCIe/SerDes DC/AC Specification

The transmitter and receiver specifications are given per the PCIe Card Electromechanical Specification rev 1.0.

11.4.3.11 PCIe Specification - Receiver

Specifications are from PCIe v2.0 (2.5GT/s).
11.4.3.12 PCIe Specification - Transmitter

Specifications are from PCIe v2.0 (2.5GT/s).

11.4.3.13 PCIe Specification - Input Clock

The input clock for PCIe must be a differential input clock with a frequency of 100 MHz. For full specifications, see the PCI Express Card Electromechanical Specification (refclk specifications).

11.4.4 SerDes DC/AC Specification

The SerDes interface supports the PICMG 3.1 (1000BASE-BX) and SFP standards. This specification defines the interface for the backplane board connection and the interface to a fiber or SFP module.

11.4.4.1 SerDes Specification - Receiver

Specifications are from the PICMG 3.1 draft specification rev 1.0, 1000BASE-BX.

11.4.4.2 SerDes Specification - Transmitter

Specifications are from the PICMG 3.1 draft specification rev 1.0, 1000BASE-BX.

11.4.4.3 SerDes Specification - Input Clock

The input clock for SerDes is a 25 MHz input crystal.

11.4.5 PHY Specification

The DC/AC specification is according to standards 802.3 and 802.3ab. 100BASE-T parameters are also described in standard ANSI X3.263.

11.4.6 XTAL/Clock Specification

The 25 MHz reference clock of the 82576 can be supplied either from a crystal or from an external oscillator. The recommended solution is to use a crystal.

11.4.6.1 Crystal Specification

Table 11-17. Specification for external crystal

Parameter name | Symbol | Recommended value | Conditions
--- | --- | --- | ---
Frequency | fo | 25.000 [MHz] | @25 [°C]
Vibration mode | | Fundamental |
Table 11-17. Specification for external crystal (continued)

Parameter name | Symbol | Recommended value | Conditions
--- | --- | --- | ---
Cut | | AT |
Operating/calibration mode | | Parallel |
Frequency tolerance @25 °C | Δf/fo @25 °C | ±30 [ppm] | @25 [°C]
Temperature tolerance | Δf/fo | ±30 [ppm] |
Operating temperature | Topr | -20 to +70 [°C] |
Non-operating temperature range | Topr | -40 to +90 [°C] |
Equivalent series resistance (ESR) | Rs | 50 [Ω] maximum | @25 [MHz]
Load capacitance | Cload | 20 [pF] |
Shunt capacitance | Co | 6 [pF] maximum |
Pullability from nominal load capacitance | Δf/Cload | 15 [ppm/pF] maximum |
Max drive level | DL | 0.5 [mW] |
Insulation resistance | IR | 500 [MΩ] minimum | @ 100V DC
Aging | Δf/fo | ±5 [ppm/year] |
External capacitors | C1, C2 | 27 [pF] |
Board resistance | Rs | 0.1 [Ω] |

11.4.6.2 External Clock Oscillator Specification

When using an external oscillator, the following connection must be used.

Figure 11-11. External clock oscillator connectivity to the 82576
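The ppm figures in Table 11-17 and Table 11-18 are easier to compare with a frequency counter reading once they are converted to hertz at the 25 MHz nominal frequency. The sketch below does that conversion and also estimates the shift caused by a load-capacitance error using the 15 ppm/pF pullability limit. The function names are hypothetical, the tolerances are assumed to be symmetric (plus or minus), and adding the initial and temperature tolerances linearly is a simple worst-case assumption rather than a rule taken from the datasheet.

```python
# Illustrative only: helper names are hypothetical.
NOMINAL_HZ = 25_000_000


def ppm_to_hz(ppm, nominal_hz=NOMINAL_HZ):
    """Frequency offset in Hz that corresponds to a ppm deviation."""
    return nominal_hz * ppm / 1e6


def pull_ppm(load_error_pf, pullability_ppm_per_pf=15.0):
    """Upper bound on the frequency shift, in ppm, for a load capacitance error."""
    return load_error_pf * pullability_ppm_per_pf


print(ppm_to_hz(30))              # crystal initial tolerance at 25 C
print(ppm_to_hz(30 + 30))         # initial plus temperature tolerance, worst case
print(ppm_to_hz(50))              # external oscillator tolerance
print(ppm_to_hz(pull_ppm(1.0)))   # 1 pF error in the load capacitance
```

These evaluate to 750 Hz, 1500 Hz, 1250 Hz and 375 Hz respectively.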
Table 11-18. Specification for external clock oscillator

Parameter name | Symbol | Value | Conditions
--- | --- | --- | ---
Frequency | fo | 25.0 [MHz] | @25 [°C]
External OSC supply swing | Vp-p | 3.3 ± 0.3 [V] |
XTAL1 swing | Vscp-p | 1.2 ± 0.1 [V] |
Frequency tolerance | Δf/fo | ±50 [ppm] | -20 to +70 [°C]
Operating temperature | Topr | -20 to +70 [°C] |
Aging | Δf/fo | ±5 ppm per year |
Coupling capacitor | Ccoupling | 10 ± 2 [pF] |

11.4.7 RBIAS Connection

For the PHY circuit, an external resistor of 1.4 kΩ (accuracy 1%) is used as a reference for the internal bias currents. This resistor is connected to balls RBIAS0/1 and GND as described below. Short connections for this resistor are compulsory. Place the resistor as close as possible to the device (less than 1").

Figure 11-12. PHY RBIAS connection
For the PCIe and SerDes circuits, an external resistor of the same type is used as a reference for the internal bias currents. This resistor is connected to balls PE_RCOMP/SER_RCOMP and GND as described below. Short connections for this resistor are compulsory. Place the resistor as close as possible to the device (less than 1").

Figure 11-13. RCOMP connection

11.5 EEPROM Flash Devices

While Intel does not make recommendations regarding these devices, the following devices have been used successfully in previous designs.

11.5.1 Flash

Type: SPI flash
Size: 256 Kbytes (typical), depending on application

Table 11-19. Serial flash table

Density | Intel PN | Atmel PN | STM PN | SST PN
--- | --- | --- | --- | ---
512 Kbit | | AT25F512N-10SI-2.7 | M25P05-AVMN6T | SST25VF512A
1 Mbit | | AT25F1024N-10SI-2.7 | M25P10-AVMN6T | SST25VF010A
2 Mbit | | AT25F2048N-10SI-2.7 | M25P20-AVMN6T |
4 Mbit | | AT25F4096N-10SI-2.7 | M25P40-AVMN6T | SST25VF040A
Table 11-19. Serial flash table (continued)

Density | Intel PN | Atmel PN | STM PN | SST PN
--- | --- | --- | --- | ---
8 Mbit | | | M25P80-AVMN6T | SST25VF080A
16 Mbit | QB25F160S33T60, QB25F160S33B60, QH25F160S33T60, QH25F160S33B60, QB25F016S33T60, QB25F016S33B60, QH25F016S33T60, QH25F016S33B60 | | M25P16-AVMN6T |
32 Mbit | QB25F320S33T60, QB25F320S33B60, QH25F320S33T60, QH25F320S33B60 | | M25P32-AVMN6T |

11.5.2 EEPROM Device Options

The standard recommendation is a 16-Kbyte EEPROM (for example, AT25128) when manageability is not employed, and a 32-Kbyte EEPROM (for example, AT25256) when manageability is used.

Table 11-20. EEPROM device options

Density [Kbits] | Atmel PN | STM PN | Catalyst PN
--- | --- | --- | ---
16 | AT25160AN-10SI-2.7 | M95160WMN6T | CAT25C16S-TE13
32 | AT25320AN-10SI-2.7 | M95320WMN6T | CAT25C32S-TE13
64 | AT25640AN-10SI-2.7 | M95640WMN6T | CAT25C64S-TE13
128 | AT25128AN-10SI-2.7 | M95128WMN6T | CAT25CS128-TE13
256 | AT25256AN-10SI-2.7 | M95256WMN6T |

11.6 Package Information

11.6.1 Mechanical

The 82576 is assembled into an FCBGA5 package.

Table 11-21. Intel 82576 GbE controller package mechanical specifications

Body size | Ball count | Ball pitch | Ball matrix
--- | --- | --- | ---
25 mm x 25 mm | 576 | 1.0 mm | 24x24
11.6.2 Intel 82576 GbE Controller Package

The table below provides package height information. Package schematics follow after the table.

Table 11-22. Package height

Balls | Substrate | UF | Die | Total
--- | --- | --- | --- | ---
0.5 mm +/- 0.1 | 0.990 mm +/- 0.06 | 0.085 mm +/- 0.025 | 0.785 mm +/- 0.025 | 2.36 mm = assembled unit

11.6.2.1 Package Schematics

Package schematics follow.
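The assembled height in Table 11-22 is the sum of the four nominal layer heights. The sketch below reproduces that sum and also adds up the individual tolerances. The table gives no tolerance for the total, so the linear worst-case sum is an assumption made here for illustration, as is the reading of "UF" as underfill.

```python
# Illustrative only: the worst-case tolerance sum is an assumption, not a datasheet value.
LAYERS_MM = {
    "balls": (0.5, 0.1),
    "substrate": (0.990, 0.06),
    "underfill": (0.085, 0.025),   # "UF" column, assumed to mean underfill
    "die": (0.785, 0.025),
}

nominal_mm = sum(height for height, _ in LAYERS_MM.values())
worst_case_tol_mm = sum(tol for _, tol in LAYERS_MM.values())

print(round(nominal_mm, 3))
print(round(worst_case_tol_mm, 3))
```

The nominal sum is 2.36 mm, which matches the total in the table; the linear tolerance sum comes to 0.21 mm.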
Package drawing D17982, rev 1, sheet 1 of 4 (size D, CAGE code 34649)
Title: Generic mechanical, FCBGA5, 25 mm x 25 mm, 576 lands
Intel Corp., 2200 Mission College Blvd., P.O. Box 58119, Santa Clara, CA 95052-8119
Title block entries: 04/05/05, J. Laird; 04/05/05, Q. Zhou
Scale: none. Do not scale drawing. Third angle projection.

This drawing contains Intel Corporation confidential information. It is disclosed in confidence and its contents may not be disclosed, reproduced, displayed or modified, without the prior written consent of Intel Corporation.

Unless otherwise specified, interpret dimensions and tolerances in accordance with ASME Y14.5M-1994. Dimensions are in millimeters. Tolerances: .X ±0.2, .XX ±0.13, .XXX ±0.05, angles ±0.2.

Revision history: rev 1, preliminary release, 04/05/05, approved QZ.

Parts list dash numbers: -001, -002, -003. Ink swatch: -001 yes, -002 no.

Notes:
1. All material shall be approved by Intel.
2. Data enclosed in parentheses is for reference only.
3. Refer to design rules for maximum die size.
4. Handling exclusion zone. There shall be no components or epoxy allowed. A 1 mm edge exclusion zone applies on the tongue side of the epoxy dispense. The final tongue position is defined in the product assembly data sheet found in the speed BOM of the substrate part number.
5. Package: fiber reinforced resin core with resin-only build-up layers. Solder resist color: green. Solder resist thickness: see table of materials (sheet 4). Copper thickness: see table of materials (sheet 4).
6. Surface finish: (ENIG) electroless nickel gold. Land and fiducial nickel plating thickness: 3 µm min. Land and fiducial gold plating thickness: 0.075 µm ± 0.045 µm.
7. Suction cup handling exclusion zone (cross-hatched area). There shall be no components or exposed metal allowed except for fiducials.
8. Ink swatch area: grey. Ink thickness: min 6.5 µm, max 20 µm.
Package drawing D17982, rev 1, sheet 2 of 4 (scale 15): bottom/land side view with front side and back side solder resist, detail B (scale 50) and detail C (scale 100, 576 places). The land grid runs from row A to row AD and from column 1 to column 24, with the pin #1 ID marked. Package outline: 25 ± 0.05 by 25 ± 0.05.

Layer count | Dimension "T"
--- | ---
4 | 0.99 ± 0.06
6 | 1.08 ± 0.06
8 | 1.17 ± 0.085

Package drawing D17982, rev 1, sheet 3 of 4 (scale 15): top/die side view with pin #1 ID, section H-H, detail D (2x, scale 40, surface alignment fiducial 1, SAF1), detail E (scale 40, surface alignment fiducial 2, SAF2), detail F (scale 40) and detail G (scale 60, LGA land, 576 places).
Package drawing D17982, rev 1, sheet 4 of 4 (scale 4): stackup detail G

Table of materials, FCBGA5 (4-8 layers)

Code | Layer # | Description | Thickness (mm) | Tolerance +/- (mm)
--- | --- | --- | --- | ---
A | Front | Solder resist | 0.021 | 0.0075
 | 2F and higher | For each additional buildup layer, repeat 1-2F and 2F layers | |
D | 2F | Copper | 0.015 | 0.005
E | 1-2F | Dielectric | 0.03 | 0.006
G | 1F | Copper | 0.025 (0.023 if PTHs are uncapped) | 0.005
H | Inner core | Dielectric | 0.8 | 0.05
J | 1B | Copper | 0.025 (0.023 if PTHs are uncapped) | 0.005
L | 1-2B | Dielectric | 0.03 | 0.006
M | 2B | Copper | 0.015 | 0.005
 | 2B and higher | For each additional buildup layer, repeat 1-2B and 2B layers | |
R | Back | Solder resist | 0.021 | 0.0075
12.0 Design Guidelines

This chapter provides guidelines for selecting components and connecting interfaces.

12.1 82575/82576

This section highlights the implementation requirements necessary to support the 82575 and 82576 using the same physical printed circuit board design. These devices have similar pin-out requirements and functionality. If a design does not implement a manageability (NC-SI) solution, then there are no special pin requirements for supporting both from a board perspective. The pin-out differences between the 82575 and 82576 are only associated with the support for NC-SI multi-drop applications.

12.1.1 Pin Out Compatibility

The pin differences between the 82575 and 82576 affect the NC-SI interface used for the manageability features of the devices. The 82576 adds hardware arbitration support for systems that need NC-SI multi-drop. This results in a change of two pins. Table 12-1 shows the two pins that have different functions on the 82575 and 82576 controllers. The effects of these differences are examined in the next section.

82576 pin descriptions:
• NC-SI_ARB_IN - NC-SI HW arbitration token input pin.
• NC-SI_ARB_OUT - NC-SI HW arbitration token output pin.

82575 pin descriptions:
• NCAD3 - no connect pin requirement.
• NCB3 - no connect pin requirement.

Table 12-1. 82575 / 82576 pin differences

82576 pin name | Ball location | 82575 pin name
--- | --- | ---
NC-SI_ARB_IN | AD3 | NCAD3
NC-SI_ARB_OUT | B3 | NCB3
12.1.1.1 Printed Circuit Board Requirements

To design a board that supports both the 82575 and 82576 implementations, connect both the AD3 and B3 balls to optionally stuffed 0-ohm resistors that can be connected to the platform for either implementation. If the NC-SI interface is not used, then leaving these pins as "no connects" is acceptable.

12.1.1.2 82576 Design

To enable a board design for the 82576, stuff the 0-ohm resistors connected to AD3 and B3 so that they can be connected to the appropriate NC-SI arbitration pins of the platform. This is the only special requirement for implementing the 82576 silicon in a design that supports both silicon devices.

12.1.1.3 82575 Design

To enable a board design for the 82575, do not stuff the 0-ohm resistors on the AD3 and B3 connections to the silicon. These pins are required to be "no connect" pins.

12.2 Port Connection to the Device

This device implements the signals required by PCIe v2.0 (2.5GT/s). This section provides an overview.

PCIe is a dual simplex, point-to-point, serial, differential, low-voltage interconnect. The signaling bit rate is 2.5 Gbps per lane per direction. Each port consists of a group of transmitters and receivers located on the same chip. Each lane consists of a transmitter and a receiver pair. A link between the ports of two devices is a collection of lanes. The device supports up to four lanes on the PCIe interface.

Each signal is 8b/10b encoded with an embedded clock. The topology consists of a transmitter (Tx) located on one device, connected through a differential pair to the receiver (Rx) on a second device. The controller may be located on the motherboard or on an add-in card using a specified connector. The lane is AC-coupled between its corresponding transmitter and receiver.
the ac-coupling capacitor is located on the board close to transmitter side. each end of the link is terminated on the die into nominal 100 ?? differential dc impedance. board termination is not required. for more information, see section 3.1 . 12.2.1 pcie reference clock the device uses a 100 mhz differential reference clock, denoted pe_clk_p and pe_clk_n. this signal is typically generated on the system board and routed to the pcie port. for add-in cards, the clock will be furnished at the pcie connector. the frequency tolerance for the pcie reference clock is +/- 300 ppm.
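the figures quoted in section 12.2 (2.5 gbps per lane, 8b/10b encoding, a 100 mhz reference clock with a 300 ppm tolerance) can be turned into usable numbers with a few lines of arithmetic. the following is a minimal sketch only; the function and constant names are hypothetical and are not part of any intel tool.

```python
# Illustrative arithmetic for the PCIe figures quoted in section 12.2.
# All names here are hypothetical helpers, not an Intel API.

RAW_GBPS_PER_LANE = 2.5        # signaling bit rate per lane, per direction
ENCODING_EFFICIENCY = 8 / 10   # 8b/10b: 8 data bits carried in 10 line bits
SUPPORTED_WIDTHS = (1, 2, 4)   # link widths supported by the device


def data_rate_gbps(lanes: int) -> float:
    """Return the post-encoding data rate per direction for a given link width."""
    if lanes not in SUPPORTED_WIDTHS:
        raise ValueError(f"unsupported link width: x{lanes}")
    return lanes * RAW_GBPS_PER_LANE * ENCODING_EFFICIENCY


def refclk_limits_hz(nominal_hz: float = 100e6, tolerance_ppm: float = 300.0):
    """Return (min, max) reference clock frequency for a symmetric ppm tolerance."""
    delta_hz = nominal_hz * tolerance_ppm / 1e6
    return nominal_hz - delta_hz, nominal_hz + delta_hz


if __name__ == "__main__":
    for width in SUPPORTED_WIDTHS:
        print(f"x{width}: {data_rate_gbps(width):.1f} gbps per direction")
    low, high = refclk_limits_hz()
    print(f"reference clock window: {low:.0f} hz to {high:.0f} hz")
```

the sketch ignores protocol overhead above the physical layer, so the real payload throughput is lower than the value it reports.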
12.2.2 other pcie signals

the device implements other signals required by the specification. the ethernet controller signals power management events to the system using the pe_wake_n signal, which operates very similarly to the familiar pci pme# signal. the pe_rst_n signal serves as the familiar reset function for the controller.

12.2.3 physical layer features

12.2.3.1 link width configuration

the device supports a maximum link width of x4, x2, or x1. refer to section 3.1.6.1.

12.2.3.2 polarity inversion

for details on the pci express polarity inversion supported by the 82576, refer to section 3.1.6.2.

12.2.3.3 lane reversal

the following lane reversal modes are supported (see figure 12-1):
• lane configuration of x4, x2, and x1
• lane reversal in x4 and in x2
• degraded mode (downshift) from x4 to x2 to x1 and from x2 to x1, with one restriction - if lane reversal is executed in x4, then downshift is only to x1 and not to x2.

these restrictions require that an x2 interface to the 82576 connect to lanes 0 and 1 on the 82576. the pcie card electromechanical specification does not allow routing an x2 link to a wider connector. therefore, the system designer is not allowed to connect an x2 link to lanes 2 and 3 of a pcie connector. it is also recommended that, when using x2 mode on a network interface card, the 82576 be connected to lanes 0 and 1 of the card.

for further details on lane reversal, refer to section 3.1.6.5.
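the downshift rule in section 12.2.3.3 is easy to misread, so it may help to see it as a small lookup. this is a sketch of the rule as stated in the text, not a model of the link training state machine; the function name and arguments are hypothetical.

```python
# Sketch of the degraded-mode (downshift) rule from section 12.2.3.3.
# The function name and its arguments are illustrative assumptions.

def allowed_downshift_widths(negotiated_width: int, lane_reversed: bool = False) -> list:
    """Return the narrower link widths the link may fall back to.

    negotiated_width: the starting width (4, 2, or 1).
    lane_reversed:    True if lane reversal was executed at that width.
    """
    if negotiated_width == 4:
        # With lane reversal executed in x4, downshift is only to x1, not x2.
        return [1] if lane_reversed else [2, 1]
    if negotiated_width == 2:
        return [1]
    if negotiated_width == 1:
        return []
    raise ValueError(f"unsupported link width: x{negotiated_width}")


if __name__ == "__main__":
    print(allowed_downshift_widths(4))
    print(allowed_downshift_widths(4, lane_reversed=True))
    print(allowed_downshift_widths(2))
```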
figure 12-1. lane reversal supported modes

12.2.4 pcie routing

for information regarding pcie signal routing, contact intel.

12.3 ethernet component design guidelines

these sections provide recommendations for selecting components and connecting special pins.

for 1000 base-t designs, the main design elements are: the 82576, an integrated or discrete magnetics module with rj-45 connector, an eeprom, and a clock source.

12.3.1 general design considerations for ethernet controllers

follow good engineering practices with respect to unused inputs by terminating them with pull-up or pull-down resistors, unless the datasheet, design guide or reference schematic indicates otherwise. do not attach pull-up or pull-down resistors to any balls identified as no connect. these devices may access test modes that could be entered unintentionally.
12.3.1.1 clock source

all designs require a 25 mhz clock source. the 82576 uses the 25 mhz source to generate clocks up to 125 mhz and 1.25 ghz for the phy circuits, and 1.25 ghz for the serdes. for optimum results with lowest cost, connect a 25 mhz parallel resonant crystal and appropriate load capacitors at the xtal1 and xtal2 leads. the frequency tolerance of the timing device should be ±30 ppm or better. see the intel fast ethernet controllers timing device selection guide, ap-419.

for information regarding the clock, see the sections on frequency control (section 12.3), crystals (section 12.4), and oscillators (section 12.5).

when selecting a crystal as the clock source, refer to crystal layout considerations in section 12.7.1.2.1.

12.3.1.2 magnetics for 1000 base-t

magnetics for the 82576 can be either integrated or discrete.

the magnetics module has a critical effect on overall ieee and emissions conformance. occasionally, components that meet basic specifications may cause the system to fail ieee testing because of interactions with other components or the printed circuit board itself. carefully qualifying new magnetics modules prevents this problem.

when using discrete magnetics, it is necessary to use bob smith termination: use four 75 Ω resistors for cable-side center taps and unused pins. this method terminates the pair-to-pair common mode impedance of the cat5 cable.

use an eft capacitor attached to the termination plane. suggested values are 1500 pf/2 kv or 1000 pf/3 kv. maintain a minimum of 50-mil spacing from the capacitor to traces and components.

12.3.1.2.1 magnetics module qualification steps

the steps involved in magnetics module qualification are similar to those for crystal qualification:
1. verify that the vendor's published specifications in the component datasheet meet or exceed the required ieee specifications.
2. independently measure the component's electrical parameters on the test bench, checking samples from multiple lots. check that the measured behavior is consistent from sample to sample and that measurements meet the published specifications.
3. perform physical layer conformance testing and emc (fcc and en) testing in real systems. vary temperature and voltage while performing system level tests.

12.3.1.2.2 magnetics module for 1000 base-t ethernet

magnetics modules for 1000 base-t ethernet are similar to those designed solely for 10/100 mbps, except that there are four differential signal pairs instead of two. use the following guidelines to verify specific electrical parameters:
1. verify that the rated return loss is 19 db or greater from 2 mhz through 40 mhz for 100 base-tx and 1000 base-t.
2. verify that the rated return loss is 12 db or greater at 80 mhz for 100 base-tx (the specification requires greater than or equal to 10 db).
3. verify that the rated return loss is 10 db or greater at 100 mhz for 1000 base-t (the specification requires greater than or equal to 8 db).
4. verify that the insertion loss is less than 1.0 db at 100 khz through 80 mhz for 100 base-tx.
5. verify that the insertion loss is less than 1.4 db at 100 khz through 100 mhz for 1000 base-t.
6. verify at least 30 db of crosstalk isolation between adjacent channels (through 150 mhz).
7. verify high voltage isolation to 1500 vrms. (does not apply to discrete magnetics.)
8. transmitter ocl should be greater than or equal to 350 µh with 8 ma dc bias.

12.3.1.2.3 third-party magnetics manufacturers

the magnetics modules listed in table 12-2 have been used successfully in previous designs.

12.3.1.2.4 layout guidelines for use with integrated and discrete magnetics

layout requirements are slightly different when using discrete magnetics. these include:
• ground cut for hv installation (not required for integrated magnetics)
• maximum of two (2) vias
• turns less than 45°
• discrete terminators

12.3.2 designing with the 82576

this section provides guidelines specific to the 82576.

12.3.2.1 lan disable

the device has three signals that can be used for disabling ethernet functions from system bios. lan0_dis_n and lan1_dis_n are the separate port disable signals and dev_off_n is the device disable signal. each signal can be driven from a system output port. choose outputs from devices that retain their values during reset. for example, the ich7 resume gpio outputs (gp24, 25, 27, 28) transition high during reset. do not use these signals to drive lan0_dis_n or lan1_dis_n, because these inputs are latched upon the rising edge of pe_rst_n or at the end of an in-band reset. the dev_off_n input is completely asynchronous and does not have this restriction.

for details on the operation of the lan disable options, see section 4.2.3.

for details on the operation of the device disable options, see section 4.4.

table 12-2. magnetics manufacturers

manufacturer      part number
pulse             h5007
bel (discrete)    bel 0344fla
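the electrical limits listed in section 12.3.1.2.2 lend themselves to a simple checklist when comparing candidate magnetics datasheets. the sketch below encodes only the return loss, insertion loss, crosstalk, and ocl limits from that list; the dictionary keys are hypothetical names chosen for this example, and the sample values are made up for illustration rather than taken from any vendor datasheet.

```python
# Checklist sketch for the magnetics limits in section 12.3.1.2.2.
# Key names are illustrative assumptions; they do not come from any vendor format.

LIMITS = [
    # (parameter key, comparison, limit)
    ("return_loss_db_2_to_40mhz", ">=", 19.0),
    ("return_loss_db_80mhz", ">=", 12.0),
    ("return_loss_db_100mhz", ">=", 10.0),
    ("insertion_loss_db_100base_tx", "<", 1.0),   # 100 kHz through 80 MHz
    ("insertion_loss_db_1000base_t", "<", 1.4),   # 100 kHz through 100 MHz
    ("crosstalk_isolation_db", ">=", 30.0),       # through 150 MHz
    ("ocl_uh_at_8ma", ">=", 350.0),
]


def failed_parameters(rated: dict) -> list:
    """Return the keys whose rated value is missing or outside the limit."""
    failures = []
    for key, comparison, limit in LIMITS:
        value = rated.get(key)
        if value is None:
            failures.append(key)      # a missing rating is treated as a failure
            continue
        passed = value >= limit if comparison == ">=" else value < limit
        if not passed:
            failures.append(key)
    return failures


if __name__ == "__main__":
    # Made-up example ratings, for illustration only.
    candidate = {
        "return_loss_db_2_to_40mhz": 20.0,
        "return_loss_db_80mhz": 12.5,
        "return_loss_db_100mhz": 9.0,
        "insertion_loss_db_100base_tx": 0.9,
        "insertion_loss_db_1000base_t": 1.1,
        "crosstalk_isolation_db": 33.0,
        "ocl_uh_at_8ma": 350.0,
    }
    print(failed_parameters(candidate))
```

a datasheet check of this kind covers only step 1 of the qualification steps; bench measurement and system-level conformance testing are still required.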
12.3.2.2 serial eeprom

the device uses a serial peripheral interface (spi)* eeprom. several words of the eeprom are accessed automatically by the device after reset to provide pre-boot configuration data before it is accessed by host software. the remainder of the eeprom space is available to software for storing the mac address, serial numbers, and additional information. see also section 3.3.1 and section 11.5.2.

12.3.2.2.1 eeprom-less operation

the device can be operated without an eeprom, but conditions apply. use of an eeprom is highly recommended. refer to section 3.3.1.7.

12.3.2.2.2 spi eeproms

atmel's at25128n and microchip's 25lc128 serial eeproms have been fully validated with the device and have been found to work satisfactorily. alternate spi eeproms that have been found to work are listed in table 11-20. spi eeproms must be rated for a clock rate of at least 2 mhz.

the recommended eeprom size is 32 kbytes (256 kbits) for all applications.

12.3.2.2.3 eeupdate

intel has an ms-dos* software utility called eeupdate. this utility can be used to program eeprom images in development or production line environments. to obtain a copy, contact your intel representative.

12.3.2.3 flash

for information on the eeprom interface and operation, refer to section 3.3.4.

12.3.2.3.1 flash device information

while intel does not make specific recommendations regarding flash devices, see table 11-19 for a list of options.

12.3.3 smbus and nc-si

smbus and nc-si are optional interfaces for pass-through and configuration traffic between the mc and the device.

note: intel recommends that the smbus be connected to the ich or mc for the eeprom recovery solution. if the connection is to a bmc, it will be able to send the eeprom release command.

the 82576 can be connected to an external bmc. it operates in one of two modes:
• smbus mode
• nc-si mode
the clock-out (if enabled) is provided in all power states (unless the device is disabled).

for more information on system management solutions, see chapter 10.0.

12.3.4 nc-si electrical interface requirements

this section describes the hardware implementation requirements necessary to meet the nc-si physical layer standard. board-level design requirements are included for connecting the 82576 to an external baseboard management controller (bmc). the layout and connectivity requirements are addressed in low-level detail.

this section, in conjunction with the network controller sideband interface specification, also provides the complete board-level requirements for the nc-si solution.

the 82576's on-board system management bus (smbus) port enables network manageability implementations required for remote control and alerting via the lan. with smbus, management packets can be routed to or from an mc. enhanced pass-through capabilities also enable system remote control over standardized interfaces. also included is a new manageability interface, nc-si, that supports the dmtf preos sideband protocol. an internal management interface called mdio enables the mac (and software) to monitor and control the phy.

figure 12-2. external mc connections with nc-si and smb
12.3.4.1 external baseboard management controller (bmc)

the external mc is required to meet the latest nc-si specification as it relates to the rmii electrical interface.

12.3.4.2 schematic showing pull-ups and pull-downs for nc-si interface

figure 12-3 shows the recommended pull-ups and pull-downs to be used on the nc-si interface regardless of the application, even when not using the interface.
• ncsi_clk_in: series termination at the mc and the 82576 should be used when using the interface. a pull-down should always be placed on this input to the 82576.
• ncsi_crs_dv: a pull-down should always be used to ensure that data valid is never indicated during start-up or when the signal is not driven.
• ncsi_rxd_0/1: pull-up resistors should always be used to ensure these lines are at "1" when nothing is driven or the interface is not used, and for multi-drop configurations.
• ncsi_tx_en: a pull-down should always be used to ensure the tx is never enabled during start-up or when the signal is not driven.
• ncsi_txd_0/1: pull-up resistors should always be used to ensure these lines are at "1" when nothing is driven or the interface is not used, and for multi-drop configurations.
when the nc-si interface is used, the mc typically requires series termination close to these transmitters.

figure 12-3. nc-si connection requirement
[schematic not reproduced. a dmtf compliant bmc device (ref_clk, crs_dv, rxd_0, rxd_1, tx_en, txd_0, txd_1) and a 50 mhz reference clock buffer connect to the 82576 nc-si interface signals: ncsi_clk_in (b5), ncsi_crs_dv (a4), ncsi_rxd_0 (b7), ncsi_rxd_1 (a6), ncsi_tx_en (b6), ncsi_txd_0 (b8), ncsi_txd_1 (a7). the schematic shows 10k pull-up (to 3.3 v) and pull-down resistors and 33 ohm and 22 ohm series resistors.]

12.3.4.3 resets

it is important to ensure that the resets for the mc and the 82576 are generated within a specific time interval. the important requirement is that the nc-si link be established within two seconds of the mc receiving the power good signal from the platform. both the 82576 and the external mc need to receive power good signals from the platform within one second of each other. this causes an internal power-on reset and initialization within the 82576, as well as a triggering and initialization sequence for the bmc. once these power good signals are received by both the 82576 and the external bmc, the nc-si interface can be initialized. the nc-si specification calls out a requirement of link establishment within two seconds. the mc should poll this interface and establish a link for two seconds to ensure specification compliance.

12.3.4.4 layout requirements

12.3.4.4.1 board impedance

the nc-si signaling interface is a single-ended signaling environment. a target board and trace impedance of 50 Ω, plus 20% and minus 10%, is recommended. this target impedance ensures optimal signal integrity and signal quality.

12.3.4.4.2 trace length restrictions

intel recommends a maximum trace length of eight inches, from a board placement and routing topology perspective, for direct connect applications (figure 12-4). this ensures that signal integrity and quality are preserved from a design perspective and that the nc-si electrical requirements are met.

for multi-drop applications (figure 12-5), the spacing recommendation is a maximum of four inches. this keeps the overall length between the mc and the 82576 within the specification.

figure 12-4. nc-si trace length requirement for direct connect
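the nc-si layout numbers in section 12.3.4.4 (50 ohm target impedance with a +20% / -10% window, eight inches for direct connect, four inches for multi-drop) can be checked mechanically during layout review. the following sketch assumes hypothetical function names and treats the four-inch multi-drop figure as a simple per-run limit, which is a simplification of the spacing recommendation.

```python
# Sketch of the NC-SI layout checks from section 12.3.4.4.
# Function names are hypothetical; limits come from the text above.

TARGET_IMPEDANCE_OHM = 50.0
PLUS_TOLERANCE_PCT = 20.0
MINUS_TOLERANCE_PCT = 10.0
MAX_LENGTH_IN = {"direct": 8.0, "multi-drop": 4.0}


def impedance_window_ohm():
    """Return the (min, max) acceptable single-ended trace impedance."""
    low = TARGET_IMPEDANCE_OHM - TARGET_IMPEDANCE_OHM * MINUS_TOLERANCE_PCT / 100.0
    high = TARGET_IMPEDANCE_OHM + TARGET_IMPEDANCE_OHM * PLUS_TOLERANCE_PCT / 100.0
    return low, high


def impedance_ok(z_ohm: float) -> bool:
    """True if the trace impedance falls inside the recommended window."""
    low, high = impedance_window_ohm()
    return low <= z_ohm <= high


def trace_length_ok(length_in: float, topology: str = "direct") -> bool:
    """True if the trace length is within the recommendation for the topology."""
    if topology not in MAX_LENGTH_IN:
        raise ValueError(f"unknown topology: {topology}")
    return 0.0 < length_in <= MAX_LENGTH_IN[topology]


if __name__ == "__main__":
    print(impedance_window_ohm())
    print(impedance_ok(52.0), trace_length_ok(6.5), trace_length_ok(6.5, "multi-drop"))
```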
figure 12-5. nc-si trace length requirement for multi-drop

12.3.5 power supplies for the intel® 82576eb gbe controller

the intel® 82576 gbe controller requires three power rails: 3.3 v, 1.8 v and 1.0 v. a central power supply can provide all the required voltage sources, or the power can be derived from the 3.3 v supply and regulated locally using external regulators. if the lan wake capability will be used, all voltages must remain present during system power down. local regulation of the lan voltages from the system 3.3 vmain and 3.3 vaux voltages is recommended.

external voltage regulators need to generate the proper voltage, meet the supply current requirements (with adequate margin), and provide the proper power sequencing.

due to the current demand, a switching voltage regulator (svr) is highly recommended for the 1.0 v power rail. figure 12-6 shows an example of a compact, low-part-count svr that can be used for both the 1.0 v and 1.8 v power supplies.

the 1.8 v rail has a lower current requirement; however, the use of an svr is still recommended for adequate margin. using a linear voltage regulator (lvr) in this application is acceptable as long as adequate margin exists in the design and sequencing can be controlled. figure 12-7 shows an example of a compact, low-part-count lvr that could be used for the 1.8 v supply.
figure 12-6. example switching voltage regulator for 1.0 v and 1.8 v
[schematic not reproduced. two step-down switching regulator stages fed from vcc3v3: 3.3 v -> 1.0 v (vcc1v0_switching, vout = 1.0 v, 2.5 a) and 3.3 v -> 1.8 v (vcc1v8_switching, vout = 1.8 v, 2.5 a).
regulator resistor selection, as noted on the schematic:
vout = 0.8 * (1 + rup / (rright + rleft))
rup = ((vout / 0.8) - 1) * (rright + rleft)
1.0 v stage: rup = ((1.0 / 0.8) - 1) * (110k + 75k) = 46.25k = 49.9 kohm
1.8 v stage: rup = ((1.8 / 0.8) - 1) * (110k + 75k) = 231.25k = 232 kohm]
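the resistor selection notes on the figure 12-6 schematic give vout = 0.8 * (1 + rup / (rright + rleft)), with rright = 110k and rleft = 75k. the sketch below simply evaluates that relationship in both directions so that the effect of picking a standard resistor value can be seen. the function names are hypothetical, and the 0.8 v reference is taken from the schematic note rather than from any regulator datasheet.

```python
# Evaluates the feedback divider relationship noted on the figure 12-6 schematic:
#   vout = vref * (1 + rup / (rright + rleft)),  vref = 0.8 V
# Function names are illustrative assumptions.

VREF_V = 0.8
RRIGHT_KOHM = 110.0
RLEFT_KOHM = 75.0


def rup_for_vout_kohm(vout_v: float) -> float:
    """Ideal upper resistor (kohm) for a target output voltage."""
    return (vout_v / VREF_V - 1.0) * (RRIGHT_KOHM + RLEFT_KOHM)


def vout_for_rup_v(rup_kohm: float) -> float:
    """Output voltage (V) that results from a chosen upper resistor."""
    return VREF_V * (1.0 + rup_kohm / (RRIGHT_KOHM + RLEFT_KOHM))


if __name__ == "__main__":
    for target_v, fitted_kohm in ((1.0, 49.9), (1.8, 232.0)):
        ideal = rup_for_vout_kohm(target_v)
        actual = vout_for_rup_v(fitted_kohm)
        print(f"target {target_v} v: ideal rup {ideal:.2f} kohm, "
              f"fitted {fitted_kohm} kohm gives {actual:.3f} v")
```

running the reverse calculation for the fitted values is a quick way to confirm that the chosen standard resistor keeps the rail within its tolerance.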
12.3.5.1 power sequencing

regardless of which type of regulator is used, all regulators need to adhere to the sequencing shown in figure 12-8 to avoid latch-up and forward-biased internal diodes.

figure 12-7. example of linear voltage regulator for 1.8 v power rail
[schematic not reproduced. two linear regulator stages: 1.8 v -> 1.0 v (vcc1v0_linear, vout = 1.0 v, 2 a) and 3.3 v -> 1.8 v (vcc1v8_linear, vout = 1.8 v).
regulator resistor selection, as noted on the schematic:
1.8 v -> 1.0 v stage: vout = vfb * (1 + (rup / rdown)); rup = ((vout / vfb) - 1) * rdown; rup = ((1.0 / 0.5) - 1) * 1k = 1 kohm
3.3 v -> 1.8 v stage: vout = vfb * (1 + (rdown / rup)) + (iadj * rdown); when rdown = 200 ohm: rup = (200) / ((((1.8 - (55u * 200)) / 1.25) - 1) = 464 =~ 470 ohm]
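the two resistor selection notes on the figure 12-7 schematic use different divider conventions, so a worked evaluation may help avoid mixing them up. the sketch below reproduces both calculations with the values printed on the schematic (vfb = 0.5 v and rdown = 1 kohm for the 1.0 v stage; vfb = 1.25 v, iadj = 55 ua and rdown = 200 ohm for the 1.8 v stage). the function names are hypothetical.

```python
# Reproduces the two linear-regulator resistor calculations noted on figure 12-7.
# Function names are illustrative; component values come from the schematic notes.

def rup_fixed_reference_ohm(vout_v: float, vfb_v: float, rdown_ohm: float) -> float:
    """1.8 V -> 1.0 V stage: vout = vfb * (1 + rup / rdown)."""
    return (vout_v / vfb_v - 1.0) * rdown_ohm


def rup_adjustable_ohm(vout_v: float, vfb_v: float, rdown_ohm: float, iadj_a: float) -> float:
    """3.3 V -> 1.8 V stage: vout = vfb * (1 + rdown / rup) + iadj * rdown."""
    return rdown_ohm / ((vout_v - iadj_a * rdown_ohm) / vfb_v - 1.0)


if __name__ == "__main__":
    print(f"1.0 v stage rup: {rup_fixed_reference_ohm(1.0, 0.5, 1000.0):.0f} ohm")
    print(f"1.8 v stage rup: {rup_adjustable_ohm(1.8, 1.25, 200.0, 55e-6):.0f} ohm")
```

the second result lands between standard values, which is why the schematic rounds it to a 470 ohm part.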
in addition, the following limitations exist:
• 1.8 v must not exceed 3.3 v.
• 1.0 v must not exceed 3.3 v.
• 1.0 v must not exceed 1.8 v.

the power supplies are all expected to ramp during a short power-up interval (approximately 20 ms or better). do not leave the device in a prolonged state where some, but not all, voltages are applied.

figure 12-8. proper power sequencing
[plot not reproduced: rail voltage (v) versus time (t) for the 3.3 v, 1.8 v, and 1.0 v rails.]

12.3.5.1.1 using regulators with enable pins

the use of regulators with enable pins is very helpful in controlling sequencing. connecting the enable of the 1.8 v regulator to 3.3 v allows the 1.8 v rail to ramp. connecting the enable of the 1.0 v regulator to the 1.8 v output ensures that the 1.0 v rail ramps after the 1.8 v rail. this provides a quick solution to power sequencing. make sure to check the design parameters for inputs with this configuration.

12.3.5.2 device power supply filtering

provide several high-frequency bypass capacitors for each power rail (see table 12-3), selecting values in the range of 0.01 µf to 0.1 µf. if possible, orient the capacitors close to the device and adjacent to power pads. decoupling capacitors should connect to the power planes with short, thick (18 mils or more) traces and 14 mil vias. long and thin traces are more inductive and would reduce the intended effect of decoupling capacitors.

furnish approximately 4.7 µf to 10 µf of bulk capacitance for all the power rails; placement should be as close to the device power connection as possible.

table 12-3. minimum number of bypass capacitors per power rail

power rail    4.7 µf or 10 µf    0.1 µf
3.3 v         1                  2
1.8 v         1                  4
1.0 v         1                  6
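the three rail limitations in section 12.3.5.1 can be checked against a captured power-up waveform. the sketch below assumes the capture has already been reduced to a list of (time, 3.3 v rail, 1.8 v rail, 1.0 v rail) samples; that data layout and the function name are assumptions made for this example, and the sample values are invented for illustration.

```python
# Checks sampled rail voltages against the limitations in section 12.3.5.1:
#   1.8 V must not exceed 3.3 V, 1.0 V must not exceed 3.3 V,
#   1.0 V must not exceed 1.8 V.
# The sample format and function name are illustrative assumptions.

def sequencing_violations(samples):
    """Return a list of (time_ms, message) for every violated limitation.

    samples: iterable of (time_ms, v3v3, v1v8, v1v0) tuples, voltages in volts.
    """
    violations = []
    for time_ms, v3v3, v1v8, v1v0 in samples:
        if v1v8 > v3v3:
            violations.append((time_ms, "1.8 v rail exceeds 3.3 v rail"))
        if v1v0 > v3v3:
            violations.append((time_ms, "1.0 v rail exceeds 3.3 v rail"))
        if v1v0 > v1v8:
            violations.append((time_ms, "1.0 v rail exceeds 1.8 v rail"))
    return violations


if __name__ == "__main__":
    # Invented waveform: the 1.0 V rail leads the 1.8 V rail at t = 2 ms.
    capture = [
        (0.0, 0.0, 0.0, 0.0),
        (1.0, 1.5, 0.4, 0.2),
        (2.0, 3.0, 0.6, 0.8),
        (5.0, 3.3, 1.8, 1.0),
    ]
    for time_ms, message in sequencing_violations(capture):
        print(f"{time_ms} ms: {message}")
```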
12.3.5.3 power management and wake up

the device supports low power operation as defined in the pci bus power management specification. there are two defined power states, d0 and d3. the d0 state provides full power operation and is divided into two sub-states: d0u (uninitialized) and d0a (active). the d3 state provides low power operation and is also divided into two sub-states: d3hot and d3cold.

to enter the low power state (d3), the software driver must stop data transmission and reception. either the operating system or the driver must program the power management control/status register (pmcsr) and the wakeup control register (wuc). if wakeup is desired, the appropriate wakeup lan address filters must also be set. the initial power management settings are specified by eeprom bits.

when the device transitions to either of the d3 low power states, the 1.0 v, 1.8 v, and 3.3 v sources must continue to be supplied to the device. otherwise, it will not be possible to use a wakeup mechanism. the aux_pwr signal is a logic input to the 82576 that denotes that auxiliary power is available. if aux_pwr is asserted, the 82576 advertises that it supports wake up from a d3cold state.

the 82576 supports both advanced power management (apm) wakeup and advanced configuration and power interface (acpi) wakeup. apm wakeup has also been known in the past as "wake on lan" and as "magic packet wake-up".

wakeup uses the pe_wake_n signal to wake the system up. pe_wake_n is an active low signal, typically connected to a gpio port on the chipset, that goes active in response to receiving a "magic packet", a network wakeup packet, or a link status change indication. pe_wake_n remains asserted until pme status is cleared in the power management control/status register.

for more information on power management, refer to chapter 5.0.

12.3.6 device test capability

the 82576 contains a test access port (3.3 v only) conforming to the ieee 1149.1a-1994 (jtag) boundary scan specification. to use the test access port, connect these balls to pads accessible by your test equipment.

a bsdl (boundary scan definition language) file describing the intel® 82576eb gbe controller is available for use in your test environment.

12.3.7 software-definable pins (sdps)

the 82576 has four software-defined pins (sdp) per port that can be used for hardware or software-programmable purposes. the pins are bound to a specific lan device (eight sdps may not be associated with a single lan device). the pins can be individually configured to act as either input or output pins. the default direction is configurable via the eeprom, as is the default value of any pins configured as outputs.

note: to avoid signal contention, the programmable pins are set as input pins until after the eeprom configuration has been loaded.

for more information on the sdps, refer to section 3.1.1.
12.4 frequency control device design considerations

this section provides information regarding frequency control devices, including crystals and oscillators. several suitable frequency control devices are available; none present any unusual challenges in selection. the concepts documented here are applicable to other data communication circuits, including phys.

intel ethernet controllers contain amplifiers which, when used with the specified external components, form the basis for feedback oscillators. oscillator circuits, which are both economical and reliable, are described in detail in section 12.5.

intel ethernet controllers also have bus clock input functionality. this functionality is not discussed in this document.

the chosen frequency control device vendor should be consulted early in the design cycle.

12.4.1 frequency control component types

several types of frequency reference components are marketed. a discussion of each follows, listed in preferred order.

12.4.1.1 quartz crystal

quartz crystals are the mainstay of frequency control components due to low cost and ease of implementation. they are available from numerous vendors in many package types with various specification options.

12.4.1.2 fixed crystal oscillator

a packaged fixed crystal oscillator comprises an inverter, a quartz crystal, and passive components packaged together. the device renders a strong, consistent square wave output. oscillators used with microprocessors are supplied in many configurations.

crystal oscillators should be restricted to use in special situations, such as shared clocking among devices or multiple controllers. as clock routing can be difficult to accomplish, it is preferable to provide a separate crystal for each device.

for intel ethernet controllers, it is acceptable to overdrive the internal inverter by connecting a 25 mhz external oscillator to the xtal1 lead, leaving the xtal2 lead unconnected. the oscillator should be specified to drive cmos logic levels, and the clock trace to the device should be as short as possible. device specifications typically call for a 40% (minimum) to 60% (maximum) duty cycle and a ±50 ppm frequency tolerance.

note: please contact your intel customer representative to obtain the most current device documentation prior to implementing this solution.

12.4.1.3 programmable crystal oscillators

a programmable oscillator can be configured to operate at many frequencies. the device contains a crystal frequency reference and a phase lock loop (pll) clock generator.
a programmable oscillator's accuracy depends heavily on the ethernet device's differential transmit lines. the physical layer (phy) uses the clock input from the device to drive a differential manchester (for 10 mbps operation), an mlt-3 (for 100 mbps operation) or a pam-5 (for 1000 mbps operation) encoded analog signal across the twisted pair cable. these signals are referred to as self-clocking, which means the clock must be recovered at the receiving link partner. clock recovery is performed with another pll that locks onto the signal at the other end.

plls are prone to exhibit frequency jitter. the transmitted signal can also have considerable jitter even with the programmable oscillator working within its specified frequency tolerance. plls must be designed carefully to lock onto signals over a reasonable frequency range. if the transmitted signal has high jitter and the receiver's pll loses its lock, then bit errors or link loss can occur.

phy devices are deployed for many different communication applications. some phys contain plls with marginal lock range and cannot tolerate the jitter inherent in data transmission clocked with a programmable oscillator. the american national standards institute (ansi) x3.263-1995 standard test method for transmit jitter is not stringent enough to predict pll-to-pll lock failures. therefore, the use of programmable oscillators is generally not recommended.

12.4.1.4 ceramic resonator

similar to a quartz crystal, a ceramic resonator is a piezoelectric device. a ceramic resonator typically carries a frequency tolerance of 0.5%, which is inadequate for use with intel ethernet controllers, so ceramic resonators should not be used.

12.5 crystal selection parameters

all crystals used with intel ethernet controllers are described as "at-cut," which refers to the angle at which the unit is sliced with respect to the long axis of the quartz stone.

table 12-4 lists crystals which have been used successfully in other designs.

table 12-4. crystal manufacturers and part numbers

manufacturer              part no.
raltron                   as-25.000-20-f-smd-t
citizen america corp      hcm4925.000mbbktr
ndk america inc           41cd25.0s11005020
txc corporation - usa     9c25000131

12.5.1 vibrational mode

crystals in the above frequency range are available in both fundamental and third overtone. unless there is a special need for third overtone, use fundamental mode crystals.

at any operating frequency, third overtone crystals are thicker and more rugged than fundamental mode crystals. third overtone crystals are more suitable for use in military or harsh industrial environments. third overtone crystals require a trap circuit (extra capacitor and inductor) in the load circuitry to suppress fundamental mode oscillation as the circuit powers up. selecting values for these components is beyond the scope of this document.
design guidelines ? intel ? 82576eb gbe controller revision: 2.63 intel ? 82576eb gbe controller december 2011 datasheet 919 12.5.2 nominal frequency intel ethernet controllers use a crystal frequency of 25.000 mhz. the 25 mhz input is used to generate a 125 mhz transmit clock for 100base-tx and 1000base-tx operation; 10 mhz and 20 mhz transmit clocks for 10base-t operation. 12.5.3 frequency tolerance the frequency tolerance for an ethernet platform lan connect is dictated by the ieee 802.3 specification as 50 parts per million (ppm). this measurement is referenced to a standard temperature of 25 c. intel recommends a frequency tolerance of 30 ppm. 12.5.4 temperature stability and environmental requirements temperature stability is a standard measure of how the oscillation frequency varies over the full operational temperature range (and beyond). several optional temperature ranges are currently available, including -40 c to +85 c for industrial environments. some vendors separate operating temperatures from temperature stability. manufacturers may also list temperature stability as 50 ppm in their data sheets. note: crystals also carry other specifications for storage temperature, shock resistance, and reflow solder conditions. crystal vendors should be consulted early in the design cycle to discuss the application and its environmental requirements. 12.5.5 calibration mode the terms ?series-resonant? and ?parallel-resonant? are used to describe crystal oscillator circuits. specifying parallel mode is critical to determining how the crystal frequency is calibrated at the factory. a crystal specified and tested as series resonant oscillates without problem in a parallel-resonant circuit, but the frequency is higher than nominal by several hundred parts per million. the purpose of adding load capacitors to a crystal oscillator circuit is to establish resonance at a frequency higher than the crystal?s inherent series resonant frequency. 
figure 12-9 illustrates a simplified schematic of the internal oscillator circuit. pins x1 and x2 refer to xtal1 and xtal2 in the ethernet device, respectively. the crystal and the capacitors form a feedback element for the internal inverting amplifier. this combination is called parallel-resonant, because it has positive reactance at the selected frequency. in other words, the crystal behaves like an inductor in a parallel lc circuit. oscillators with piezoelectric feedback elements are also known as "pierce" oscillators.
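the tolerance figures in section 12.5.3 are easier to apply on the bench when converted to hertz. the following is a minimal sketch, not part of the original datasheet: the function names and the example counter reading are hypothetical, and only the 25 mhz nominal frequency and the ±50 ppm and ±30 ppm limits come from the text.

```python
NOMINAL_HZ = 25_000_000  # 25.000 MHz crystal frequency (section 12.5.2)
IEEE_LIMIT_PPM = 50.0    # IEEE 802.3 limit quoted in section 12.5.3
INTEL_LIMIT_PPM = 30.0   # tighter value recommended in section 12.5.3


def ppm_offset(measured_hz, nominal_hz=NOMINAL_HZ):
    """Deviation of a measured frequency from nominal, in parts per million."""
    return (measured_hz - nominal_hz) / nominal_hz * 1e6


def allowed_band_hz(limit_ppm, nominal_hz=NOMINAL_HZ):
    """Lowest and highest frequency that still meet a +/- limit_ppm tolerance."""
    delta_hz = nominal_hz * limit_ppm / 1e6
    return nominal_hz - delta_hz, nominal_hz + delta_hz


def within_tolerance(measured_hz, limit_ppm, nominal_hz=NOMINAL_HZ):
    """True when the measured frequency is inside the +/- limit_ppm window."""
    return abs(ppm_offset(measured_hz, nominal_hz)) <= limit_ppm


if __name__ == "__main__":
    reading_hz = 25_001_000  # hypothetical counter reading, 1 kHz above nominal
    print(f"offset: {ppm_offset(reading_hz):+.1f} ppm")
    print("meets 50 ppm:", within_tolerance(reading_hz, IEEE_LIMIT_PPM))
    print("meets 30 ppm:", within_tolerance(reading_hz, INTEL_LIMIT_PPM))
```

at 25 mhz, ±50 ppm corresponds to ±1250 hz and ±30 ppm to ±750 hz, so a reading 1 khz above nominal passes the ieee limit but not the tighter recommended one.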
figure 12-9. internal oscillator circuit

12.5.6 load capacitance

the formula for crystal load capacitance is as follows:

$$C_L = \frac{C_1 \cdot C_2}{C_1 + C_2} + C_{stray}$$

where $C_1 = C_2 = 27$ pf and $C_{stray}$ = allowance for additional capacitance in pads, traces and the chip carrier within the ethernet device package.

an allowance of 3 pf to 7 pf accounts for lumped stray capacitance. the calculated load capacitance is 16 pf with an estimated stray capacitance of about 5 pf.

individual stray capacitance components can be estimated and added. for example, surface mount pads for the load capacitors add approximately 2.5 pf in parallel to each capacitor. this technique is especially useful if y1, c1 and c2 must be placed farther than approximately one-half (0.5) inch from the device. thin circuit boards generally have higher stray capacitance than thick circuit boards. consult the pcie design guide for more information.

oscillator frequency should be measured with a precision frequency counter where possible. the load specification or values of c1 and c2 should be fine tuned for the design. as the actual capacitance load increases, the oscillator frequency decreases.

note: c1 and c2 may vary by as much as 5% (approximately 1 pf) from their nominal values.

12.5.7 shunt capacitance

the shunt capacitance parameter is relatively unimportant compared to load capacitance. shunt capacitance represents the effect of the crystal's mechanical holder and contacts. the shunt capacitance should not exceed 7 pf.
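the load capacitance relationship in section 12.5.6 (c1 in series with c2, plus stray capacitance) can be evaluated numerically when tuning c1 and c2. this is a minimal sketch with hypothetical function names; the 27 pf capacitor value and the 3 pf to 7 pf stray allowance are taken from the text. note that with 27 pf capacitors the series term alone is 13.5 pf, so a 16 pf load corresponds to about 2.5 pf of stray capacitance, while 5 pf of stray gives 18.5 pf. the result should therefore be confirmed by frequency measurement rather than taken from the arithmetic alone.

```python
def load_capacitance_pf(c1_pf, c2_pf, c_stray_pf):
    """Crystal load capacitance: C1 in series with C2, plus stray capacitance."""
    return (c1_pf * c2_pf) / (c1_pf + c2_pf) + c_stray_pf


def stray_for_target_pf(c1_pf, c2_pf, target_load_pf):
    """Stray capacitance that would make the circuit hit a target load value."""
    return target_load_pf - (c1_pf * c2_pf) / (c1_pf + c2_pf)


if __name__ == "__main__":
    c1 = c2 = 27.0  # pF, load capacitor value given in the text
    for stray in (3.0, 5.0, 7.0):  # pF, the stray allowance range from the text
        load = load_capacitance_pf(c1, c2, stray)
        print(f"stray {stray:.1f} pF -> load {load:.1f} pF")
    implied = stray_for_target_pf(c1, c2, 16.0)
    print(f"stray implied by a 16 pF load: {implied:.1f} pF")
```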
12.5.8 equivalent series resistance

equivalent series resistance (esr) is the real component of the crystal's impedance at the calibration frequency, which the inverting amplifier's loop gain must overcome. esr varies inversely with frequency for a given crystal family. the lower the esr, the faster the crystal starts up. use crystals with an esr value of 50 Ω or lower.

12.5.9 drive level

drive level refers to power dissipation in use. the allowable drive level for a surface mounted technology (smt) crystal is less than its through-hole counterpart, because surface mount crystals are typically made from narrow, rectangular at-cut strips, rather than circular at-cut quartz blanks. some crystal data sheets list crystals with a maximum drive level of 1 mw. however, intel ethernet controllers drive crystals to a level less than the suggested 0.5 mw value. this parameter does not have much value for on-chip oscillator use.

12.5.10 aging

aging is a permanent change in frequency (and resistance) occurring over time. this parameter is most important in its first year because new crystals age faster than old crystals. use crystals with a maximum of 5 ppm per year aging.

12.5.11 reference crystal

the normal tolerances of the discrete crystal components can contribute to small frequency offsets with respect to the target center frequency. to minimize the risk of tolerance-caused frequency offsets causing a small percentage of production line units to be outside of the acceptable frequency range, it is important to account for those shifts while empirically determining the proper values for the discrete loading capacitors, c1 and c2. even with a perfect support circuit, most crystals will oscillate slightly higher or slightly lower than the exact center of the target frequency.
therefore, frequency measurements (which determine the correct value for c1 and c2) should be performed with an ideal reference crystal. when the capacitive load is exactly equal to the crystal's load rating, an ideal reference crystal will be perfectly centered at the desired target frequency.

12.5.11.1 reference crystal selection

there are several methods available for choosing the appropriate reference crystal. these are listed below.

• if a saunders and associates (s&a) crystal network analyzer is available, then discrete crystal components can be tested until one is found with zero or nearly zero ppm deviation (with the appropriate capacitive load). a crystal with zero or near zero ppm deviation will be a good reference crystal to use in subsequent frequency tests to determine the best values for c1 and c2.
• if a crystal analyzer is not available, then the selection of a reference crystal can be done by measuring a statistically valid sample population of crystals, which has units from multiple lots and approved vendors. the crystal that has an oscillation frequency closest to the center of the distribution should be the reference crystal used during testing to determine the best values for c1 and c2.
• it may also be possible to ask the approved crystal vendors or manufacturers to provide a reference crystal with zero or nearly zero deviation from the specified frequency when it has the specified cload capacitance.

when choosing a crystal, keep in mind that to comply with ieee specifications for 10/100 and 10/100/1000base-t ethernet lan, the transmitter reference frequency must be precise within ±50 ppm. intel® recommends using a transmitter reference frequency that is accurate to within ±30 ppm to account for variations in crystal accuracy due to crystal manufacturing tolerance.

12.5.11.2 circuit board

since dielectric layers of the circuit board are allowed some reasonable variation in thickness, stray capacitance from the printed board (to the crystal circuit) will also vary. if the thickness tolerance for the outer layers of dielectric is controlled within 17 percent of nominal, then the circuit board should not cause more than 2 pf variation to the stray capacitance at the crystal. when tuning crystal frequency, it is recommended that at least three circuit boards are tested for frequency. these boards should be from different production lots of bare circuit boards.

alternatively, a larger sample population of circuit boards can be used. a larger population will increase the probability of obtaining the full range of possible variations in dielectric thickness and the full range of variation in stray capacitance.

next, the exact same crystal and discrete load capacitors (c1 and c2) must be soldered onto each board and the lan reference frequency should be measured on each circuit board.

the circuit board that has a lan reference frequency closest to the center of the frequency distribution should be used while performing the frequency measurements to select the appropriate value for c1 and c2.
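both the reference crystal and the reference circuit board are chosen the same way: measure a sample population and keep the unit whose frequency is closest to the center of the distribution. the sketch below illustrates that selection. it assumes the median is used as the "center" (the text does not say whether median or mean is intended), and the unit labels and readings are made-up example data.

```python
from statistics import median


def pick_reference_unit(readings_hz):
    """Return the label of the unit closest to the center of the distribution.

    readings_hz maps a unit label (crystal or board) to its measured frequency.
    The center is taken as the median, which is an assumption of this sketch.
    """
    center_hz = median(readings_hz.values())
    return min(readings_hz, key=lambda unit: abs(readings_hz[unit] - center_hz))


if __name__ == "__main__":
    # Hypothetical measurements of five boards fitted with the same crystal.
    sample = {
        "board_a": 25_000_100,
        "board_b": 24_999_900,
        "board_c": 25_000_020,
        "board_d": 25_000_400,
        "board_e": 24_999_700,
    }
    print("reference unit:", pick_reference_unit(sample))
```

the median is less sensitive than the mean to a single outlying lot, which is why it is used here; a mean-based center is an equally simple substitution.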
12.5.11.3 temperature changes

temperature changes can cause crystal frequency to shift. frequency measurements should be done in the final system chassis across the system's rated operating temperature range.

12.6 oscillator support

the 82576 clock input circuit is optimized for use with an external crystal. however, an oscillator can also be used in place of the crystal with the proper design considerations:

• the clock oscillator has an internal voltage regulator of 1.2 v to isolate it from the external noise, to minimize jitter. if an external clock is used, this imposes a maximum input clock amplitude of 1.2 v.
• the input capacitance introduced by the 82576 (approximately 20 pf) is greater than the capacitance specified by a typical oscillator (approximately 15 pf).
• the input clock jitter from the oscillator can impact the 82576 clock and its performance.
note: the power consumption of additional circuitry equals about 1.5 mw.

table 12-6 lists oscillators that have been used successfully in past designs (please note that no particular product is recommended).

12.6.1 oscillator solution

this solution involves capacitor c1, which forms a capacitor divider with c stray of about 20 pf. this attenuates the input clock amplitude and adjusts the clock oscillator load capacitance.

v in = vdd * (c1 / (c1 + c stray))
v in = 3.3 * (c1 / (c1 + c stray))

this enables clock oscillators rated for a 15 pf load to be used. if the value of c stray is unknown, c1 should be adjusted by tuning the input clock amplitude to approximately 1 v peak-to-peak. if c stray equals 20 pf, then c1 is 10 pf ±10%.

a low capacitance, high impedance probe (c < 1 pf, r > 500 kΩ) should be used for testing. probing can affect the measurement of the clock amplitude and cause errors in the adjustment, so circuit operation should also be tested after the probe has been removed. if jitter performance is poor, a lower jitter clock oscillator can be used.

table 12-5. intel® 82576 gbe controller clock oscillator specifications

symbol | parameter | specifications (min typical max) | units
f0 | frequency (@25 °c) | - 25 - | mhz
vp-p | external oscillator supply swing | 3.0 3.3 3.6 | v
vscp-p | xtal1 swing | 1.1 1.2 1.3 | v
Δf/fo | frequency tolerance (@ -20 to +70 °c) | - 50 | ppm
topr | operating temperature | - -20 to +70 | °c
Δf/fo | aging | - 5 | ppm/year
ccoupling | coupling capacitor | 8 10 12 | pf

table 12-6. oscillator manufacturers and part numbers

manufacturer | part no.
raltron | co4305-25.000-tr
citizen america corp | csx750fbb25.000mtr
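the capacitor divider in section 12.6.1 can be checked numerically against the xtal1 swing limits in table 12-5. the following is a rough sketch under stated assumptions: the function names are hypothetical, c stray is taken as the approximately 20 pf input capacitance mentioned in section 12.6, and the load seen by the oscillator is modelled simply as c1 in series with c stray.

```python
def xtal1_swing_v(vdd_v, c1_pf, c_stray_pf):
    """Divider output: v in = vdd * c1 / (c1 + c stray)."""
    return vdd_v * c1_pf / (c1_pf + c_stray_pf)


def c1_for_swing_pf(vdd_v, target_v, c_stray_pf):
    """Solve the divider equation for c1 given a target input swing."""
    return target_v * c_stray_pf / (vdd_v - target_v)


def oscillator_load_pf(c1_pf, c_stray_pf):
    """Load seen by the oscillator, modelled as c1 in series with c stray."""
    return c1_pf * c_stray_pf / (c1_pf + c_stray_pf)


if __name__ == "__main__":
    c_stray = 20.0  # pF, approximate input capacitance from section 12.6
    for c1 in (8.0, 10.0, 12.0):  # pF, coupling capacitor range in table 12-5
        swing = xtal1_swing_v(3.3, c1, c_stray)
        load = oscillator_load_pf(c1, c_stray)
        print(f"c1 {c1:4.1f} pF -> swing {swing:.2f} V, load {load:.2f} pF")
```

with vdd = 3.3 v, c1 = 10 pf and c stray = 20 pf the divider gives 1.1 v, and the modelled load is well below the 15 pf that a typical oscillator is specified to drive.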
12.7 ethernet component layout guidelines

these sections provide recommendations for performing printed circuit board layouts. good layout practices are essential to meet ieee phy conformance specifications and emi regulatory requirements.

12.7.1 layout considerations

critical signal traces should be kept as short as possible to decrease the likelihood of being affected by high frequency noise from other signals, including noise carried on power and ground planes. keeping the traces as short as possible can also reduce capacitive loading.

since the transmission line medium extends onto the printed circuit board, special attention must be paid to layout and routing of the differential signal pairs.

designing for 1000base-t gigabit operation is very similar to designing for 10 and 100 mbps. system level tests should be performed at all three speeds.

12.7.1.1 guidelines for component placement

component placement can affect signal quality, emissions, and component operating temperature. this section provides guidelines for component placement.

careful component placement can:

• decrease potential problems directly related to electromagnetic interference (emi), which could cause failure to meet applicable government test specifications.
• simplify the task of routing traces. to some extent, component orientation will affect the complexity of trace routing. the overall objective is to minimize turns and crossovers between traces.

minimizing the amount of space needed for the ethernet lan interface is important because other interfaces will compete for physical space on a motherboard near the connector. the ethernet lan circuits need to be as close as possible to the connector.

figure 12-11 shows some basic placement distance guidelines. figure 12-11 shows two differential pairs, but can be generalized for a gigabit system with four analog pairs.
the ideal placement for the ethernet silicon would be approximately one inch behind the magnetics module.

figure 12-10. reference oscillator circuit
while it is generally a good idea to minimize lengths and distances, figure 12-11 also illustrates the need to keep the lan silicon away from the edge of the board and the magnetics module for best emi performance. figure 12-12 and figure 12-13 illustrate reference layouts for integrated and discrete magnetics, respectively.

figure 12-11. general placement distances for 1000base-t designs

note: this figure represents a 10/100 diagram. use the same design considerations for the two differential pairs not shown for gigabit implementations.
figure 12-12. layout for integrated magnetics

figure 12-13. layout for discrete magnetics
12.7.1.2 crystals and oscillators

clock sources should not be placed near i/o ports or board edges. radiation from these devices may be coupled into the i/o ports and radiate beyond the system chassis. crystals should also be kept away from the ethernet magnetics module to prevent interference.

12.7.1.2.1 crystal layout considerations

note: failure to follow these guidelines could result in the 25 mhz clock failing to start.

when designing the layout for the crystal circuit, use the following rules:

• place load capacitors as close as possible (within design-for-manufacturability rules) to crystal solder pads. they should be no more than 90 mils away from crystal pads.
• the two load capacitors, crystal component, the ethernet controller device, and the crystal circuit traces must all be located on the same side of the circuit board (maximum of one via-to-ground load capacitor on each xtal trace).
• use 27 pf (5% tolerance) 0402 load capacitors.
• place load capacitor solder pad directly in line with circuit trace (see figure 12-14, point a).
• place a 30-ohm (5% tolerance) 0402 series resistor on xtal2 (see figure 12-14, point c). the location of the resistor along the xtal2 trace is flexible, as long as it is between the load capacitor and the controller.
• use 50-ohm impedance single-ended microstrip traces for the crystal circuit.
• route traces so that electro-magnetic fields from xtal2 do not couple onto xtal1. no differential traces.
• route xtal1 and xtal2 traces to nearest inside corners of crystal pad (see figure 12-14, point b).
• ensure that the traces from xtal1 and xtal2 are symmetrically routed and that their lengths are matched.
• the total trace length of xtal1 or xtal2 should be less than 750 mils.
figure 12-14. recommended crystal placement and layout

12.7.1.3 board stack up recommendations

printed circuit boards for these designs typically have six, eight, or more layers. although the 82576 does not dictate the stackup, here is an example of a typical six-layer board stackup:

• layer 1 is a signal layer. it can contain the differential analog pairs from the ethernet device to the magnetics module, or to an optical transceiver.
• layer 2 is a signal ground layer. chassis ground may also be fabricated in layer 2 under the connector side of the magnetics module.
• layer 3 is used for power planes.
• layer 4 is a signal layer.
• layer 5 is an additional ground layer.
• layer 6 is a signal layer. for 1000base-t (copper) gigabit designs, it is common to route two of the differential pairs (per port) on this layer.

this configuration can be adjusted to conform to your company's rules.

12.7.1.4 differential pair trace routing for 10/100/1000 designs

trace routing considerations are important to minimize the effects of crosstalk and propagation delays on sections of the board where high-speed signals exist. signal traces should be kept as short as possible to decrease interference from other signals, including those propagated through power and ground planes. observe the following suggestions to help optimize board performance:
• maintain constant symmetry and spacing between the traces within a differential pair.
• minimize the difference in signal trace lengths of a differential pair.
• keep the total length of each differential pair under 4 inches. although possible, designs with differential traces longer than 5 inches are much more likely to have degraded receive ber (bit error rate) performance, ieee phy conformance failures, and/or excessive emi (electromagnetic interference) radiation.
• do not route a pair of differential traces closer than 100 mils to another differential pair.
• do not route any other signal traces parallel to the differential traces, and closer than 100 mils to the differential traces (300 mils is recommended).
• keep maximum separation within differential pairs to 7 mils.
• for high-speed signals, the number of corners and vias should be kept to a minimum. if a 90° bend is required, it is recommended to use two 45° bends instead. see figure 12-15.
  note: in manufacturing, vias are required for testing and troubleshooting purposes. the via size should be a 17-mil (±2 mils for manufacturing variance) finished hole size (fhs).
• traces should be routed away from board edges by a distance greater than the trace height above the reference plane. this allows the field around the trace to couple more easily to the ground plane rather than to adjacent wires or boards.
• do not route traces and vias under crystals or oscillators. this will prevent coupling to or from the clock. as a general rule, keep traces from clocks and drives away from apertures by a distance that is greater than the largest aperture dimension.
• the reference plane for the differential pairs should be continuous and low impedance. it is recommended that the reference plane be either ground or 1.8 v (the voltage used by the phy). this provides an adequate return path for high frequency noise currents.
• do not route differential pairs over splits in the associated reference plane as it may cause discontinuity in impedances.

figure 12-15. trace routing

12.7.1.4.1 signal termination and coupling

the four differential pairs of each port are terminated with 49.9 Ω (1% tolerance) resistors, placed near the controller. one resistor connects to the mdi+ signal trace and another resistor connects to the mdi- signal trace. the opposite ends of the resistors connect together and to ground through a single 0.1 µf capacitor. the capacitor should be placed as close as possible to the 49.9 Ω resistors, using a wide trace. stubs created by the 49.9 Ω (1% tolerance) termination resistors should be kept at a minimum.

do not vary the suggested component values. be sure to lay out symmetrical pads and traces for these components such that the length and symmetry of the differential pairs are not disturbed.
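the numeric routing limits in section 12.7.1.4 lend themselves to a simple automated check on measurements exported from a layout tool. the sketch below is illustrative only: the `DiffPair` record and `check_pair` function are hypothetical and do not correspond to any cad tool api. only the limits themselves (4 inches, 100 mils, 300 mils recommended, 7 mils) come from the text.

```python
from dataclasses import dataclass

MAX_PAIR_LENGTH_MILS = 4000       # keep each pair under 4 inches
MIN_PAIR_TO_PAIR_MILS = 100       # spacing to another differential pair
MIN_TO_OTHER_SIGNAL_MILS = 100    # hard limit to other parallel signals
RECOMMENDED_TO_OTHER_MILS = 300   # recommended spacing to other signals
MAX_INTRA_PAIR_GAP_MILS = 7       # maximum separation within a pair


@dataclass
class DiffPair:
    """Hypothetical summary of one routed pair; all distances in mils."""
    name: str
    length_mils: float
    intra_pair_gap_mils: float
    nearest_pair_mils: float
    nearest_parallel_signal_mils: float


def check_pair(pair):
    """Return a list of findings; an empty list means no limit was exceeded."""
    findings = []
    if pair.length_mils >= MAX_PAIR_LENGTH_MILS:
        findings.append(f"{pair.name}: length {pair.length_mils} mils is not under 4 inches")
    if pair.intra_pair_gap_mils > MAX_INTRA_PAIR_GAP_MILS:
        findings.append(f"{pair.name}: intra-pair gap exceeds {MAX_INTRA_PAIR_GAP_MILS} mils")
    if pair.nearest_pair_mils < MIN_PAIR_TO_PAIR_MILS:
        findings.append(f"{pair.name}: closer than {MIN_PAIR_TO_PAIR_MILS} mils to another pair")
    if pair.nearest_parallel_signal_mils < MIN_TO_OTHER_SIGNAL_MILS:
        findings.append(f"{pair.name}: parallel signal closer than {MIN_TO_OTHER_SIGNAL_MILS} mils")
    elif pair.nearest_parallel_signal_mils < RECOMMENDED_TO_OTHER_MILS:
        findings.append(f"{pair.name}: parallel signal inside the recommended {RECOMMENDED_TO_OTHER_MILS} mils")
    return findings


if __name__ == "__main__":
    for finding in check_pair(DiffPair("mdi1", 4500, 9, 80, 150)):
        print(finding)
```

a check like this only covers the numeric limits; symmetry, reference plane continuity, and bend geometry still need visual review.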
12.7.1.5 signal trace geometry for 1000base-t designs

key factors in controlling trace emi radiation are the trace length and the ratio of trace-width to trace-height above the reference plane. to minimize trace inductance, high-speed signals and signal layers that are close to a reference or power plane should be as short and wide as practical. ideally, this trace width to height above the ground plane ratio is between 1:1 and 3:1. to maintain trace impedance, the width of the trace should be modified when changing from one board layer to another if the two layers are not equidistant from the neighboring planes.

each pair of signals should have a differential impedance of 100 Ω ±15%. if a particular tool cannot design differential traces, it is permissible to specify 55-65 Ω single-ended traces as long as the spacing between the two traces is minimized. as an example, consider a differential trace pair on layer 1 that is 8 mils (0.2 mm) wide and 2 mils (0.05 mm) thick, with a spacing of 8 mils (0.2 mm). if the fiberglass layer is 8 mils (0.2 mm) thick with a dielectric constant, εr, of 4.7, the calculated single-ended impedance would be approximately 61 Ω and the calculated differential impedance would be approximately 100 Ω.

when performing a board layout, do not allow the cad tool auto-router to route the differential pairs without intervention. in most cases, the differential pairs will have to be routed manually.

note: measuring trace impedance for layout designs targeting 100 Ω often results in lower actual impedance. designers should verify actual trace impedance and adjust the layout accordingly. if actual impedance is consistently low, a target of 105 Ω to 110 Ω should compensate for second order effects.

it is necessary to compensate for trace-to-trace edge coupling (which can lower the differential impedance by up to 10 Ω) when the traces within a pair are closer than 30 mils (edge to edge).

12.7.1.6 trace length and symmetry for 1000base-t designs

as indicated, the overall length of differential pairs should be less than four inches measured from the ethernet device to the magnetics.

the differential traces (within each pair) should be equal in total length to within 50 mils (1.25 mm) and as symmetrical as possible. asymmetrical and unequal length traces in the differential pairs contribute to common mode noise. if a choice has to be made between matching lengths and fixing symmetry, more emphasis should be placed on fixing symmetry. common mode noise can degrade the receive circuit's performance and contribute to radiated emissions.

12.7.1.6.1 signal detect

each port of the 82576 has a signal detect pin for connection to optical transceivers. for designs without optical transceivers, these signals can be left unconnected because they have internal pull-up resistors. signal detect is not a high-speed signal and does not require special layout.

12.7.1.7 routing 1.8 v to the magnetics center tap

the center-tap 1.8 v should be delivered as a solid supply plane (1.8 v) directly to the magnetics module or, if this is not possible, by a short and thick trace (less than 0.2 Ω dc resistance). the decoupling capacitors for the center tap pins should be placed as close as possible to the magnetics component. this improves both emi and ieee compliance.
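the worked example in section 12.7.1.5 (8 mil wide, 2 mil thick traces, 8 mil spacing, 8 mil dielectric, εr = 4.7) can be reproduced with widely used closed-form microstrip approximations. the text does not say which calculator produced its figures, so the formulas below (ipc-2141 style surface microstrip and edge-coupled pair approximations) are an assumption of this sketch, and the function names are hypothetical. they are approximations with a limited validity range and do not replace a field solver or a measurement.

```python
import math


def microstrip_z0_ohm(width, thickness, height, er):
    """Single-ended surface microstrip impedance (closed-form approximation).

    width, thickness and height must share one unit (mils in this example).
    """
    return 87.0 / math.sqrt(er + 1.41) * math.log(5.98 * height / (0.8 * width + thickness))


def edge_coupled_zdiff_ohm(z0_ohm, spacing, height):
    """Differential impedance of an edge-coupled microstrip pair."""
    return 2.0 * z0_ohm * (1.0 - 0.48 * math.exp(-0.96 * spacing / height))


if __name__ == "__main__":
    # Geometry from the example in section 12.7.1.5, all dimensions in mils.
    z0 = microstrip_z0_ohm(width=8.0, thickness=2.0, height=8.0, er=4.7)
    zdiff = edge_coupled_zdiff_ohm(z0, spacing=8.0, height=8.0)
    print(f"single-ended: {z0:.1f} ohm")
    print(f"differential: {zdiff:.1f} ohm")
    print(f"naive 2 x single-ended: {2.0 * z0:.1f} ohm")
```

for this geometry the approximations give roughly 61 Ω single-ended and 100 Ω differential, matching the figures quoted in the text. simply doubling the single-ended value ignores edge coupling and overstates the differential impedance.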
12.7.1.8 impedance discontinuities

impedance discontinuities cause unwanted signal reflections. minimize vias (signal through holes) and other transmission line irregularities. if vias must be used, a reasonable budget is two per differential trace. unused pads and stub traces should also be avoided.

12.7.1.9 reducing circuit inductance

traces should be routed over a continuous reference plane with no interruptions. if there are vacant areas on a reference or power plane, the signal conductors should not cross the vacant area. this causes impedance mismatches and associated radiated noise levels. noisy logic grounds should be separated from analog signal grounds to reduce coupling. noisy logic grounds can sometimes affect sensitive dc subsystems such as analog to digital conversion, operational amplifiers, etc. all ground vias should be connected to every ground plane; similarly, every power via, to all power planes at equal potential. this helps reduce circuit inductance. another recommendation is to physically locate grounds to minimize the loop area between a signal path and its return path. rise and fall times should be as slow as possible (because signals with fast rise and fall times contain many high frequency harmonics, which can radiate significantly). the most sensitive signal returns closest to the chassis ground should be connected together. this will result in a smaller loop area and reduce the likelihood of crosstalk. the effect of different configurations on the amount of crosstalk can be studied using electronics modeling software.

12.7.1.10 signal isolation

to maintain best signal integrity, keep digital signals far away from analog traces. a good rule of thumb is no digital signal should be within 300 mils (7.5 mm) of differential pairs.
if digital signals on other board layers cannot be separated by a ground plane, they should be routed perpendicular to differential pairs. if there is another lan controller on the board, take care to keep differential pairs away from that circuit.

rules follow for signal isolation:

• separate and group signals by function on separate layers if possible. maintain a gap of 100 mils between all differential pairs (ethernet) and other nets, but group associated differential pairs together. over the length of the trace run, each differential pair should be at least 0.3 inches away from any parallel signal traces.
• physically group components associated with one clock trace to reduce trace length and radiation.
• isolate i/o signals from high-speed signals to minimize crosstalk, which can increase emi emission and susceptibility to emi.
• avoid routing high-speed lan traces near other high-frequency signals associated with a video controller, cache controller, processor, or other similar devices.

12.7.1.11 power and ground planes

good grounding requires minimizing inductance levels in the interconnections. keeping ground returns short, signal loop areas small, and power inputs bypassed to signal return significantly reduces emi radiation.

these guidelines help reduce circuit inductance in backplanes and motherboards:
• route traces over a continuous plane with no interruptions. do not route them over a split power or ground plane. if there are vacant areas on a ground or power plane, avoid routing signals over the vacant area. this will increase inductance and emi radiation levels.
• separate noisy digital grounds from analog grounds to reduce coupling. noisy digital grounds may affect sensitive dc subsystems.
• all ground vias should be connected to every ground plane; every power via should be connected to all power planes at equal potential. this reduces circuit inductance.
• physically locate grounds to minimize the loop area between a signal path and its return path.
• avoid fast rise/fall times as much as possible. signals with fast rise and fall times contain many high frequency harmonics (which can radiate emi).
• the ground plane beneath a magnetics module should be split. the rj45 connector side of the transformer module should have a chassis ground beneath it.

12.7.1.12 traces for decoupling capacitors

traces between decoupling and i/o filter capacitors should be as short and wide as practical. long and thin traces are more inductive and reduce the intended effect of decoupling capacitors. for similar reasons, traces to i/o signals and signal terminations should be as short as possible. vias to the decoupling capacitors should be sufficiently large in diameter to decrease series inductance.

12.7.1.13 light emitting diodes for designs based on the 82576

the 82576 provides four programmable high-current push-pull (active high) outputs per port to directly drive leds for link activity and speed indication. each lan device provides an independent set of led outputs; these pins and their function are bound to a specific lan device.
each of the four led outputs can be individually configured to select the particular event, state, or activity that will be indicated on that output. in addition, each led can be individually configured for output polarity, as well as for blinking versus non-blinking (steady-state) indication.

since the leds are likely to be integral to a magnetics module, take care to route the led traces away from potential sources of emi noise. in some cases, it may be desirable to attach filter capacitors.

the led ports are fully programmable through the eeprom interface.

12.7.1.14 thermal design considerations

the intel® 82576 gbe controller contains a thermal sensor that is accessible through the smbus. trip points can be set in the eeprom for the device.

icepak* and flowtherm* models are available for the intel® 82576eb gbe controller; contact your intel representative for information.

12.7.2 physical layer conformance testing

physical layer conformance testing (also known as ieee testing) is a fundamental capability for all companies with ethernet lan products. phy testing is the final determination that a layout has been performed successfully. if your company does not have the resources and equipment to perform these tests, consider contracting the tests to an outside facility.

12.7.2.1 conformance tests for 10/100/1000 mbps designs

crucial tests are as follows, listed in priority order:
• bit error rate (ber). good indicator of real world network performance. perform bit error rate testing with long and short cables and many link partners. the test limit is a bit error rate of 10^-11.
• output amplitude, rise and fall time (10/100 mbps), symmetry and droop (1000 mbps). for this controller, use the appropriate phy test waveform.
• return loss. indicator of proper impedance matching, measured through the rj-45 connector back toward the magnetics module.
• jitter test (10/100 mbps) or unfiltered jitter test (1000 mbps). indicator of clock recovery ability (master and slave for gigabit controller).

12.7.3 troubleshooting common physical layout issues

the following is a list of common physical layer design and layout mistakes in lan on motherboard designs.

1. lack of symmetry between the two traces within a differential pair. asymmetry can create common-mode noise and distort the waveforms. for each component and/or via that one trace encounters, the other trace should encounter the same component or a via at the same distance from the ethernet silicon.
2. unequal length of the two traces within a differential pair. inequalities create common-mode noise and will distort the transmit or receive waveforms.
3. excessive distance between the ethernet silicon and the magnetics. long traces on an fr4 fiberglass epoxy substrate will attenuate the analog signals. in addition, any impedance mismatch in the traces will be aggravated if they are longer than the four inch guideline.
4. routing any other trace parallel to and close to one of the differential traces. crosstalk getting onto the receive channel will cause degraded long cable ber. crosstalk getting onto the transmit channel can cause excessive emi emissions and can cause poor transmit ber on long cables. at a minimum, other signals should be kept 0.3 inches from the differential traces.
5. routing one pair of differential traces too close to another pair of differential traces. after exiting the ethernet silicon, the trace pairs should be kept 0.3 inches or more away from the other trace pairs. the only possible exceptions are in the vicinities where the traces enter or exit the magnetics, the rj-45 connector, and the ethernet silicon.
6. use of a low-quality magnetics module.
7. re-use of an out-of-date physical layer schematic in an ethernet silicon design. the terminations and decoupling can be different from one phy to another.
8. incorrect differential trace impedances. it is important to have ~100 Ω impedance between the two traces within a differential pair. this becomes even more important as the differential traces become longer. to calculate differential impedance, many impedance calculators only multiply the single-ended impedance by two. this does not take into account edge-to-edge capacitive coupling between the two traces. when the two traces within a differential pair are kept close to each other, the edge coupling can lower the effective differential impedance by 5 Ω to 20 Ω. short traces will have fewer problems if the differential impedance is slightly off target.

12.8 serdes implementation

this section clarifies serdes implementation. see also: serdes application note an498.
12.8.1 connecting the serdes interface

table 12-7. connecting the serdes interface
column groups: signal name; bx backplane connector (connection, pu/pd); sff (laser) transceiver (connection, pu/pd); sfp connector (connection, pu/pd)

srdsi_0_p          x -- rdm -- rd+ --
srdsi_0_n          x -- rdp -- rd- --
srdso_0_p          x -- tdp -- td+ --
srdso_0_n          x -- tdm -- td+ --
srds_0_sig_det     pu sd -- pu
sdp0_0             -- -- -- -- --
sdp0_1 (1)         -- -- -- txfault pu
sdp0_2 (2)         -- -- tdis pd txdis pu
sdp0_3 (2)         -- -- laser_pwr -- los pu
sfp0_i2c_clk (3)   -- -- -- -- mod-def1 pu
sfp0_i2c_data      -- -- -- -- mod-def2 pu
srdsi_1_p          x -- rdm -- rd+ --
srdsi_1_n          x -- rdp -- rd- --
srdso_1_p          x -- tdp -- td+ --
srdso_1_n          x -- tdm -- td+ --
srds_1_sig_det     pu sd -- pu
sdp1_0
sdp1_1 (1)         -- -- -- -- txfault pu
sdp1_2 (1)         -- -- tdis pd txdis pu
sdp1_3 (1)         -- -- laser_pwr -- los pu
sfp1_i2c_clk (3)   -- -- -- -- mod-def1 pu
sfp1_i2c_data      -- -- -- -- mod-def2 pu

notes:
1. sdp pins are software definable pins; however, use this implementation as it has been tested and verified with intel drivers. failure to follow these recommendations could cause interoperability issues.
2. pu value ? 10k.
3. the sfpn_i2c_clk does not allow for clock stretching.

12.8.2 output voltage adjustment

serdes differential amplitude can be adjusted by eeprom or register modification.
• eeprom: to obtain an eeprom image that sets an amplitude other than the default, contact intel.
• using the register: see 0x00024 serdesctl.

12.8.3 output voltage adjustment

table 12-8. output voltage adjustment
setting | reg. number 0x00 | reg. number 0x34 | diff amp [mV pk-pk] core 0 | diff amp [mV pk-pk] core 1
min reg. value | 0x00 | 0x05 | 134 | 136
min amp -1step | 0x8c | 0x05 | 684 | 689
min amp ~750mV pk-pk | 0x9c | 0x05 | 741 | 750
min amp +1 | 0xac | 0x05 | 797 | 806
default by eeprom | 0xfc | 0x05 | 1.072 | 1.06
default by eeprom +1step | 0x0c | 0x15 | 1.131 | 1.117
max reg. value | 0xfc | 0x15 | 1.738 | 1.74

12.9 thermal management

see chapter 13.0, thermal design specifications.

12.10 reference schematics

reference schematics (serdes\fiber\sfp and copper) are available as a separate document through intel documentation channels.

12.11 checklists

the schematic checklist and the layout and placement checklist are available as a separate document through intel documentation channels.

12.12 symbols

the cad model for this product is available from your intel representative.
13.0 thermal design specifications

13.1 product package thermal specification

table 13-1. package thermal characteristics in a standard system environment
package type | measured power (thermal design power*) | θja | ψjt
25mm 576fcbga5-4l, no thermal solution | 2.4 W | 18.4 °C/W | 1.7 °C/W
25mm 576fcbga5-4l, with thermal solution | 2.4 W | 13.5 °C/W | 0.6 °C/W
* see section 13.5 for a definition of this parameter.

the thermal parameters defined above are for reference only and based on a combination of empirical/simulated results of packages assembled on a standard 4s4p 1.0-oz cu signal layer, 1.0-oz cu power/ground layer board in a natural convection environment.

• θja is the package junction-to-air thermal resistance.
• ψjt is the junction-to-package top thermal characterization parameter.

system designs may vary considerably from the typical jedec board environment used. package thermal models are available upon request (flotherm 2-resistor, delphi or detailed and icepak format).

13.2 introduction

this chapter describes the thermal characteristics for the 82576. use this document to design a thermal solution for systems implementing the device. properly designed solutions provide adequate cooling to maintain the case temperature (tcase) at or below those listed in table 13-2. ideally, this is accomplished by providing a low local ambient temperature and creating a minimal thermal resistance to that local ambient temperature. heat sinks may be required if case temperatures exceed those listed. operating outside of these operating limits may result in improper functionality or permanent damage to the intel component and potentially other components within the system. maintaining the proper thermal environment is essential to reliable, long-term component/system operation.
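as an illustration of how the table 13-1 parameters are typically used, the sketch below multiplies the 2.4 W thermal design power by θja and ψjt to get first-order temperature rises. it assumes the usual steady-state relation (temperature rise = power x thermal resistance) and the reference-board conditions stated above; the function and dictionary names are hypothetical and are not part of any intel tool.

```python
# first-order temperature-rise estimate from the table 13-1 values.
# assumption: steady state, rise [°C] = power [W] * resistance [°C/W].
# names below are illustrative only.

TDP_W = 2.4  # thermal design power from table 13-1

# (theta_ja, psi_jt) in °C/W, copied from table 13-1
PACKAGE_PARAMS = {
    "no thermal solution": (18.4, 1.7),
    "with thermal solution": (13.5, 0.6),
}


def temperature_rise_c(power_w: float, resistance_c_per_w: float) -> float:
    """return the steady-state temperature rise across a thermal resistance."""
    return power_w * resistance_c_per_w


if __name__ == "__main__":
    for name, (theta_ja, psi_jt) in PACKAGE_PARAMS.items():
        rise_ja = temperature_rise_c(TDP_W, theta_ja)  # junction above ambient
        rise_jt = temperature_rise_c(TDP_W, psi_jt)    # junction above package top
        print(f"{name}: junction-ambient {rise_ja:.2f} °C, "
              f"junction-top {rise_jt:.2f} °C")
```

for the package without a thermal solution this gives a junction-to-ambient rise of about 44.2 °C (2.4 x 18.4). because real systems differ from the jedec reference board, treat such numbers as a starting point and not as a substitute for measuring tcase.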
13.3 measuring the thermal conditions

this chapter provides a method for determining the operating temperature of the device in a specific system based on case temperature. case temperature is a function of the local ambient and internal temperatures of the component. this document specifies a maximum allowable tcase for the device.

13.4 thermal considerations

in a system environment, the temperature of a component is a function of both the system and component thermal characteristics. system-level thermal constraints consist of the local ambient temperature at the component, the airflow over the component and surrounding board, and the physical constraints at, above, and surrounding the component that may limit the size of a thermal enhancement (heat sink).

the component's case/die temperature depends on:
• component power dissipation
• size
• packaging materials (effective thermal conductivity)
• type of interconnection to the substrate and motherboard
• presence of a thermal cooling solution
• power density of the substrate, nearby components, and motherboard

these parameters are pushed by the continued trend of technology to increase performance levels (higher operating speeds, mhz) and power density (more transistors). as operating frequencies increase and packaging size decreases, the power density increases and the thermal cooling solution space and airflow become more constrained. the result is an increased emphasis on system design to ensure that thermal design requirements are met for each component in the system.

the thermal management objective is to ensure that all system component temperatures are maintained within functional limits. the functional temperature limit is the range in which the electrical circuits are expected to meet specified performance requirements. operation outside the functional limit can degrade system performance, cause logic errors, or cause component and/or system damage. temperatures exceeding the maximum operating limits may result in irreversible changes in the component operating characteristics. also note that sustained operation at component maximum temperature limit may affect long-term device reliability.

13.5 packaging terminology

the following is a list of packaging terminology used in this document:
• fcbga (flip chip ball grid array): a surface mount package using a combination of flip chip and bga structure whose pcb-interconnect method consists of eutectic solder ball array on the interconnect side of the package. the die is flipped and connected to an organic build-up substrate with c4 bumps. an integrated heat spreader (ihs) may be present for larger fcbga packages for enhanced thermal performance.
• junction: refers to a p-n junction on the silicon. in this document, it is used as a temperature reference point (for example, θja refers to the "junction" to "ambient" thermal resistance).
• ambient: refers to local ambient temperature of the bulk air approaching the component. it can be measured by placing a thermocouple approximately 1" upstream from the component edge.
• lands: the pads on the pcb to which bga balls are soldered.
• pcb: printed circuit board.
• printed circuit assembly (pca): an assembled pcb.
• thermal design power (tdp): the estimated maximum possible/expected power generated in a component by a realistic application. use maximum power requirement numbers from table 13-1.
• lfm: linear feet per minute (airflow).

13.6 thermal specifications

to ensure proper operation and reliability of the device, the thermal solution must maintain a case temperature at or below the values specified in table 13-2. system-level or component-level thermal enhancements are required to dissipate the generated heat if the case temperature exceeds the maximum temperatures listed in table 13-2.

analysis indicates that real applications are unlikely to cause the device to be at tcase-max for sustained periods of time. given that tcase should reasonably be expected to be a distribution of temperatures, sustained operation at tcase-max may be indicative that the given thermal solution will also result in situations where tcase exceeds the specified maximum value. such thermal designs may affect long-term reliability of the device and the system, and sustained performance at tcase-max should be evaluated during the thermal design process and steps taken to further reduce the tcase temperature.

good system airflow is critical to dissipate the highest possible thermal power. the size and number of fans, vents, and/or ducts, and their placement in relation to components and airflow channels within the system, determine airflow. acoustic noise constraints may limit the size and types of fans, vents and ducts that can be used in a particular design. to develop a reliable, cost-effective thermal solution, all of the system variables must be considered. use system-level thermal characteristics and simulations to account for individual component thermal requirements.

13.6.1 case temperature

the device is designed to operate properly as long as the maximum tcase is not exceeded. see section 13.8.1 for guidelines.

table 13-2. 82576 thermal absolute maximum rating
parameter | maximum
tcase-hs (1) | 113 °C
tcase-no hs (2) | 111 °C
1. tcase-hs is defined as the maximum case temperature with the default enhanced thermal solution attached.
2. tcase-no hs is defined as the maximum case temperature without any thermal enhancement to the package.
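the "at or below" rule for table 13-2 can be written as a small margin check, which is handy when logging thermocouple readings during validation. this is a minimal sketch: the limits are taken from table 13-2, while the configuration keys and function names are hypothetical.

```python
# margin check against the table 13-2 maximum case temperatures.
# the dictionary keys and function names are illustrative assumptions.

TCASE_MAX_C = {
    "heat_sink": 113.0,     # tcase-hs, default enhanced thermal solution attached
    "no_heat_sink": 111.0,  # tcase-no hs, no thermal enhancement
}


def tcase_margin_c(measured_tcase_c: float, configuration: str) -> float:
    """return the remaining margin in °C; negative means the limit is exceeded."""
    return TCASE_MAX_C[configuration] - measured_tcase_c


def needs_more_cooling(measured_tcase_c: float, configuration: str) -> bool:
    """true only when the measured tcase is above the table 13-2 maximum."""
    return tcase_margin_c(measured_tcase_c, configuration) < 0.0


if __name__ == "__main__":
    for reading in (95.0, 111.0, 112.5):
        print(reading, tcase_margin_c(reading, "no_heat_sink"),
              needs_more_cooling(reading, "no_heat_sink"))
```

a reading exactly at the limit passes the check, but section 13.6 cautions that sustained operation at tcase-max suggests the thermal solution has too little headroom.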
13.7 thermal attributes

13.7.1 designing for thermal performance

the appendices of this document give the pcb and system design recommendations required to achieve the thermal performance documented herein.

13.7.2 typical system definitions

the following system example is used to generate thermal characteristics data:
• heat sink case assumes the default enhanced thermal solution. see section 13.7.6.
• evaluation board is a standard multi-layer 4s4p 1.0-oz cu signal layer, 1.0-oz power/ground layer pcb.
• data at 50 lfm and 150 lfm is validated against physical samples.

your design may be different. a larger board size with more than six cu layers may increase thermal performance.

13.7.3 package thermal characteristics

figure 13-1 shows the required local ambient temperature versus airflow for a typical system. thermal models are available upon request (flotherm: 2-resistor, delphi, or detailed and icepak: detailed). contact intel sales.

figure 13-1. maximum allowable ambient temperature vs. airflow
table 13-3 shows tcase as a function of airflow and ambient temperature at the tdp for a typical system and aids in determining the optimum airflow and heat sink combination for the device.

table 13-3. expected tcase (°C) for heat sink attached at tdp

the following table shows tcase as a function of airflow and ambient temperature at the tdp for a typical system and aids in determining the optimum airflow for the device.

table 13-4. expected tcase (°C) for no heat sink attached at tdp

note: the underlined values indicate airflow/local ambient combinations that exceed the allowable case temperature for the typical system.

thermal enhancements (if required) are frequently used to improve thermal performance by attaching a metallic heat sink to the component top, which increases the component's surface area. increasing the surface area of the heat sink reduces the thermal resistance from the heat sink to the air, increasing heat transfer.
13.7.4 clearance

to be effective, a heat sink requires a pocket of air around it free of obstructions. though each design may have unique mechanical restrictions, recommended clearance zones for a heat sink used with the 82576ea/eb/es are in figure 13-2 and figure 13-3.

figure 13-2. 82576ea/eb/es heat sink volume restrictions: primary side

figure 13-3. 82576ea/eb/es heat sink volume restrictions: secondary side
13.7.5 default enhanced thermal solution

if you have no control over the end-user's thermal environment or if you wish to bypass the thermal modeling and evaluation process, use the default enhanced thermal solution (discussed in the following section). if the case temperature continues to exceed the appropriate value listed in table 13-2 after implementing the default enhanced thermal solution, additional cooling is needed; see figure 13-1. the thermal performance gain may be achieved by improving airflow to the component and/or adding additional thermal enhancements.

13.7.6 extruded heat sinks

if required, the following extruded heat sink is the suggested thermal solution. figure 13-4 shows the heat sink drawing. for equivalent heat sinks and sources, see section 13.9.

figure 13-4. 82576 extruded heat sink (in millimeters)

13.7.7 attaching the extruded heat sink

the extruded heat sink may be attached using clips with a phase change thermal interface material.

13.7.7.1 clips

a well-designed clip, in conjunction with a thermal interface material (tape, grease, etc.), often offers the best combination of mechanical stability and rework-ability. use of a clip requires significant advance planning as mounting holes are required in the pcb. use non-plated mounting holes with a grounded annular ring on the solder side of the board surrounding the hole. for a typical low-cost clip, set the annular ring inner diameter to 150 mils and the outer diameter to 300 mils. define the ring to have at least eight ground connections. set the solder mask opening for these holes with a radius of 300 mils.
13.7.7.2 thermal interface material (pcm45f)

the recommended thermal interface is pcm45f from honeywell. the pcm45f thermal interface pads are phase change materials formulated for use in high performance devices requiring minimum thermal resistance for maximum heat sink performance and component reliability. these pads consist of an electrically non-conductive, dry film that softens at device operating temperatures, resulting in "grease-like" performance. alternate recommended tim is pcm45f from honeywell for cost saving purposes. however, intel has not fully validated the pcm45f tim.

follow the manufacturer's recommended attach procedure for the recommended thermal interface:

1. ensure that the component surface and heat sink are free from contamination. using proper safety precautions, clean the package top with a lint-free wipe and isopropyl alcohol.
2. preheat the heat sink to 50 °C. remove the honeywell pcm45f from the carrier. for best results, peel the tim off of the carrier by peeling back the carrier at 180 degrees.
3. carefully align the pad, and place it on the heat sink.
4. apply 10 psi pressure to the pcm45f pad and let the heat sink cool to room temperature (25 °C).
5. remove the top liner. peel back at 180 degrees to prevent voids and achieve best results.
6. dents and minor scratches in the material will not affect performance since the material is designed to flow at typical operating temperatures.

honeywell pads can be removed for rework using a single-edged razor and then cleaning the surface with isopropyl alcohol (ipa) solvent. each pca, system and heat sink combination varies in attach strength. carefully evaluate the reliability of tape attaches prior to high-volume use.

figure 13-5. pcm45f attach process (in roll form)
figure 13-6. completing the attach process

13.7.8 reliability

each pca, system and heat sink combination varies in attach strength and long-term adhesive performance. carefully evaluate the reliability of the completed assembly prior to high-volume use. some reliability recommendations are shown in table 13-5.

table 13-5. reliability validation
test (1) | requirement | pass/fail criteria (2)
mechanical shock | 50g, board level, 11 ms trapezoidal pulse, 3 shocks/axis | visual & electrical check
random vibration | 7.3g, board level, 45 minutes/axis, 50 to 2000 hz | visual & electrical check
high-temperature life | 85 °C, 2000 hours total; checkpoints occur at 168, 500, 1000, and 2000 hours | visual & mechanical check
thermal cycling | per-target environment (for example: -40 °C to +85 °C), 500 cycles | visual & mechanical check
humidity | 85% relative humidity, 85 °C, 1000 hours | visual & mechanical check
1. perform the above tests on a sample size of at least 12 assemblies from 3 lots of material (total = 36 assemblies).
2. additional pass/fail criteria can be added at your discretion.

13.7.9 thermal interface management for heat-sink solutions

to optimize the 82576 heat sink design, it is important to understand the interface between the exposed die and the heat sink base. specifically, thermal conductivity effectiveness depends on the following:
• bond line thickness
• interface material area
• interface material thermal conductivity
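the three factors listed in section 13.7.9 (bond line thickness, interface area, and thermal conductivity) combine in the standard one-dimensional conduction formula r = t / (k x a). the sketch below applies that formula; it ignores contact resistance and voids, and every numeric input is a made-up illustration and not a pcm45f or 82576 specification.

```python
# one-dimensional conduction estimate for a thermal interface layer:
#   r [°C/W] = thickness [m] / (conductivity [W/(m*K)] * area [m^2])
# assumption: uniform layer, no contact resistance, no voids.
# all example numbers are hypothetical, not vendor data.


def tim_resistance_c_per_w(bond_line_m: float,
                           conductivity_w_per_mk: float,
                           area_m2: float) -> float:
    """return the conduction resistance of a uniform interface layer."""
    if bond_line_m < 0 or conductivity_w_per_mk <= 0 or area_m2 <= 0:
        raise ValueError("thickness must be >= 0; conductivity and area must be > 0")
    return bond_line_m / (conductivity_w_per_mk * area_m2)


if __name__ == "__main__":
    # hypothetical inputs: 50 um bond line, 3 W/(m*K), 10 mm x 10 mm contact area
    r = tim_resistance_c_per_w(50e-6, 3.0, 10e-3 * 10e-3)
    print(f"interface resistance: {r:.3f} °C/W")
    print(f"temperature drop at 2.4 W: {2.4 * r:.2f} °C")
```

the formula makes the trade-offs visible: halving the bond line thickness or doubling the contact area halves the resistance, which is why flatness and complete gap filling matter in the sections that follow.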
13.7.9.1 bond line management

the gap between the exposed die and the heat sink base impacts the heat-sink solution performance. the larger the gap between the two surfaces, the greater the thermal resistance. the thickness of the gap is determined by the flatness of both the heat sink base and the exposed die, plus the thickness of the thermal interface material (for example, psa, thermal grease, epoxy) used to join the two surfaces. planarity of the 82576 package is 8 mils (in accordance with jedec specifications).

13.7.9.2 interface material performance

the following two factors impact the performance of the interface material between the exposed die and the heat sink base:
• thermal resistance of the material
• wetting/filling characteristics of the material

13.7.9.2.1 thermal resistance of material

thermal resistance describes the ability of the thermal interface material to transfer heat from one surface to another. the higher the thermal resistance, the less efficient the heat transfer. the thermal resistance of the interface material has a significant impact on the thermal performance of the overall thermal solution. with a higher thermal resistance, there will be a larger temperature drop across the interface.

13.7.9.2.2 wetting/filling characteristics of material

the wetting/filling characteristic of the thermal interface material is its ability to fill the gap between the exposed die top surface and the heat sink. since air is an extremely poor thermal conductor, the more completely the interface material fills the gaps, the lower the temperature drop across the interface, increasing the efficiency of the thermal solution.

13.8 measurements for thermal specifications

determining the thermal properties of the system requires careful case temperature measurements. guidelines for measuring the 82576 case temperature are provided in section 13.8.1.

13.8.1 case temperature measurements

maintain the 82576 tcase at or below the maximum case temperatures listed in table 13-2 to ensure functionality and reliability. special care is required when measuring the case temperature to ensure an accurate temperature measurement. use the following guidelines when making case measurements:
• measure the surface temperature of the case in the geometric center of the case top.
• calibrate the thermocouples used to measure tcase before making temperature measurements.
• use 36-gauge (maximum) k-type thermocouples.
care must be taken to avoid introducing errors into the measurements when measuring a surface temperature that is a different temperature from the surrounding local ambient air. measurement errors may be due to a poor thermal contact between the thermocouple junction and the surface of the package, heat loss by radiation, convection, conduction through thermocouple leads, and/or contact between the thermocouple cement and the heat-sink base (if used).

13.8.1.1 attaching the thermocouple (no heat sink)

the following approach is recommended to minimize measurement errors for attaching the thermocouple with no heat sink:
• use 36 gauge or smaller diameter k-type thermocouples.
• ensure that the thermocouple has been properly calibrated.
• attach the thermocouple bead or junction to the top surface of the package (case) in the center of the silicon die using high thermal conductivity cement.
• it is critical that the entire thermocouple lead be butted tightly to the exposed die.
• attach the thermocouple at a 0° angle if there is no interference with the thermocouple attach location or leads (figure 13-7). this is the preferred method and is recommended for use with non-enhanced packages.

figure 13-7. technique for measuring tcase with 0° angle attachment, no heat sink

13.8.1.2 attaching the thermocouple (heat sink)

the following approach is recommended to minimize measurement errors for attaching the thermocouple with heat sink:
• use 36 gauge or smaller diameter k-type thermocouples.
• ensure that the thermocouple is properly calibrated.
• attach the thermocouple bead or junction to the case's top surface in the geometric center using high thermal conductivity cement.
• it is critical that the entire thermocouple lead be butted tightly against the case.
• attach the thermocouple at a 90° angle if there is no interference with the thermocouple attach location or leads (refer to figure 13-8). this is the preferred method and is recommended for use with packages with heat sinks.
• for testing purposes, a hole (no larger than 0.150" in diameter) must be drilled vertically through the center of the heat sink to route the thermocouple wires out.
• ensure there is no contact between the thermocouple cement and heat sink base. any contact affects the thermocouple reading.
figure 13-8. technique for measuring tcase with 90° angle attachment

13.9 heat sink and attach suppliers

table 13-6. heat sink and attach suppliers
part | part number | supplier | contact information
heatsink | 728443-001 | foxconn hon hai precision industry co ltd | contact: susiey chen, susiey.chen@foxconn.com
retention mechanism | c63585-0c1 | cci chaun-choung tech. corp | contact: monica chih, 12f no123-1 hsing-de rd. sanchung, taipei, taiwan, tel: 886-2-29952666, fax: 886-2-29958258, monica_chih@ccic.com.tw
thermal interface | pcm45f (included with heatsink), size = 20 mm^2 | honeywell | north america technical contact: paula knoll, 1349 moffett park dr. sunnyvale, ca 94089, cell: 1-858-705-1274, business: 858-279-2956, paula.knoll@honeywell.com

13.10 pcb guidelines

the following general pcb design guidelines are recommended to maximize the thermal performance of fcbga packages:
• when connecting ground (thermal) vias to the ground planes, do not use thermal-relief patterns.
• thermal-relief patterns are designed to limit heat transfer between the vias and the copper planes, thus constricting the heat flow path from the component to the ground planes in the pcb.
• as board temperature also has an effect on the thermal performance of the package, avoid placing the 82576 adjacent to high power dissipation devices.
• if airflow exists, locate the components in the mainstream of the airflow path for maximum thermal performance. avoid placing the components downstream, behind larger devices or devices with heat sinks that obstruct the air flow or supply excessively heated air.

the above guidelines are not all inclusive and are defined to give you known, good design practices to maximize the thermal performance of the components.
14.0 diagnostics

14.1 jtag test mode description

the 82576 includes a jtag (tap) port compliant with the ieee standard test access port and boundary scan architecture 1149.1 specification. the tap controller is accessed serially through four dedicated pins: jtck, jtms, jtdi, and jtdo. jtms, jtdi, and jtdo operate synchronously with jtck. jtck is independent of all other device clocks. this interface can be used for test and debug purposes. system board interconnects can be dc tested using the boundary scan logic in pads. table 14-1 shows tap controller related pin descriptions. table 14-2 describes the tap instructions supported.

table 14-1. tap controller pins
signal | i/o | description
jtck | in | test clock input for the test logic defined by ieee1149.1. if utilizing jtag, connect this signal to ground through a 1 kΩ pull-down resistor.
jtdi | in | test data input. serial test instructions and data are received by the test logic at this pin. if utilizing jtag, connect this signal to vcc33 through a 1 kΩ pull-up resistor.
jtdo | o/d | test data output. the serial output for the test instructions and data from the test logic defined in ieee1149.1. if utilizing jtag, connect this signal to vcc33 through a 1 kΩ pull-up resistor.
jtms | in | test mode select input. the signal received at jtms is decoded by the tap controller to control test operations.

table 14-2. tap instructions supported
instruction | description | comment
bypass | the bypass command selects the bypass register, a single bit register connected between tdi and tdo pins. this allows more rapid movement of test data to and from other components in the system. | ieee 1149.1 std. instruction
extest | the extest instruction allows circuitry or wiring external to the devices to be tested. boundary-scan register cells at outputs are used to apply stimulus while boundary-scan cells at input pins are used to capture data. | ieee 1149.1 std. instruction
sample / preload | the sample/preload instruction is used to allow scanning of the boundary scan register without causing interference to the normal operation of the device. two functions can be performed by use of the sample/preload instruction. sample: allows a snapshot of the data flowing into and out of a device to be taken without affecting the normal operation of the device. preload: allows an initial pattern to be placed into the boundary scan register cells. this allows initial known data to be present prior to the selection of another boundary-scan test operation. | ieee 1149.1 std. instruction
idcode | the idcode instruction is forced into the parallel output latches of the instruction register during the test-logic-reset tap state. this allows the device identification register to be selected by manipulation of the broadcast tms and tck signals for testing purposes, as well as by a conventional instruction register scan operation. the id code value for the 82576 a0 is 0x010c9013 (intel's vendor id = 0x13, device id = 0x10c9, rev id = 0x0). | ieee 1149.1 std. instruction
highz | the highz instruction is used to force all outputs of the device (except tdo) into a high impedance state. this instruction shall select the bypass register to be connected between tdi and tdo in the shift-dr controller state. | ieee 1149.1 std. instruction
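the idcode value quoted in table 14-2 can be split into its fields using the standard ieee 1149.1 device identification register layout (bits 31:28 version, bits 27:12 part number, bits 11:1 manufacturer identity, bit 0 fixed at 1). the sketch below is illustrative only; the type and function names are hypothetical. note that the "vendor id = 0x13" quoted in the table corresponds to the low 12 bits including the fixed bit 0, while the 11-bit manufacturer field on its own decodes to 0x009.

```python
# decode a 32-bit jtag idcode using the ieee 1149.1 field layout.
# names are illustrative; only the bit layout comes from the standard.
from typing import NamedTuple


class JtagIdcode(NamedTuple):
    version: int          # bits 31:28
    part_number: int      # bits 27:12
    manufacturer_id: int  # bits 11:1
    marker_bit: int       # bit 0, always 1 for a valid idcode


def decode_idcode(idcode: int) -> JtagIdcode:
    """split a 32-bit idcode word into its four fields."""
    if not 0 <= idcode <= 0xFFFFFFFF:
        raise ValueError("idcode must fit in 32 bits")
    return JtagIdcode(
        version=(idcode >> 28) & 0xF,
        part_number=(idcode >> 12) & 0xFFFF,
        manufacturer_id=(idcode >> 1) & 0x7FF,
        marker_bit=idcode & 0x1,
    )


if __name__ == "__main__":
    fields = decode_idcode(0x010C9013)  # 82576 a0 value from table 14-2
    print(f"version=0x{fields.version:x} part=0x{fields.part_number:04x} "
          f"manufacturer=0x{fields.manufacturer_id:03x} bit0={fields.marker_bit}")
```

a boundary-scan bring-up script could use such a helper to confirm that the part number field reads 0x10c9 before running extest or sample/preload.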
15.0 models, symbols, testing options, schematics and checklists

15.1 models and symbols

ibis, bsdl, and hspice modeling data is available from intel.

15.2 physical layer conformance testing

physical layer conformance testing (also known as ieee testing) is a fundamental capability for all companies with ethernet lan products. if your company does not have the resources and equipment to perform these tests, consider contracting the tests to an outside facility.

15.3 schematics

intel® 82576eb gbe controller reference schematics are available on the intel developer site. see http://developer.intel.com/products/ethernet/index.htm?iid=nc+ethernet .

15.4 checklists

intel® 82576eb gbe controller schematic and layout and placement checklists are available on the intel developer site. see http://developer.intel.com/products/ethernet/index.htm?iid=nc+ethernet .
Appendix A. Changes from the 82575

This appendix summarizes the changes in the programming interface of legacy functionality in the 82576 relative to the 82575.

Table A-1. Changes in Programming Interface Relative to 82575

Feature | Registers | Description | Impact on legacy drivers
General | CTRL.VLE | This register only affects the VLAN strip in Rx. It does not have any influence on the Tx path in the 82576. | None.
General | PBA / PBS | Replaced by RXPBS, TXPBS & SWPBS registers. | None, assuming default values are kept.
Interrupts | Interrupt registers | Uses an allocation method like the 82598. MSIXPBM registers replaced by IVAR registers. EICR & other layout changed. | Drivers using extended causes require an update to the new mode.
Interrupts | EITR | Changes the granularity to 1 µs instead of 256 ns. Adds support for low latency interrupt moderation. | Drivers using EITR require an update to the new mode.
Interrupts | RXCFG | RXCFG bit not supported. | Cannot detect code words while in force link mode. Can use the HW detection of a non-auto-negotiation partner instead.
Receive | RDFPCQ | Removed Receive Data FIFO Packet Count - not used by SW. | None.
Receive | Error bits in Rx descriptor | L2 error bits are all merged into the RXE error bits. The other L2 error bits (including ECC) are now reserved. | None.
Receive | RXCTL | CPUID field expanded to 8 bits to support new DCA standard. | Drivers supporting DCA should update to the new register layout.
Receive | Rx queues registers | Moved to a different address space. Aliases for Q1-3 at 82575 address space. | None.
Receive | RXDCTL | Threshold fields are 5 bits instead of 6 due to the reduction in descriptor cache size. | Do not program a threshold bigger than the cache size.
Receive filtering | FFMT, FFVT, FFLT, TFFLT, FTFT | Registers replaced with FHFT & FTFT using the 82598's filters layout. | Wakeup flex filter programming needs to be updated.
Receive filtering | RCTL | Removed Receive Descriptor Minimum Threshold Size field - replaced by per queue SRRCTL.RDMTS fields. Default values are the same. | None, assuming default values are kept.
Receive filtering | RAH | Changed the queue indication from queue number to pools bitmap. | Drivers supporting VMDq1.0 should change programming of RAH.
Receive filtering | VMD_CTL | Default queue is now shared between the pools instead of default queue per pool. | Drivers supporting VMDq1.0 should change programming of VMD_CTL.
Receive filtering | MRQC | Added encoding for the Multiple Receive Queues Enable field. Fields supported in 82575 are kept. | None.
Receive filtering | RETA | Moved the queue index to bits 3:0 of each entry. No support for individual tables for each pool. | Drivers supporting RSS should update table programming.
Receive filtering | VFQA0,1 | Replaced by VLVF registers. | Use new registers for VLAN queueing.
Transmit | TXDCTL | Priority bit removed, as a new arbitration scheme has been added to the 82576. | Drivers using per Tx queue priority should update to the new arbiter.
Transmit | TXDCTL | Threshold fields are 5 bits instead of 6 due to the reduction in descriptor cache size. | Do not program a threshold bigger than the cache size.
Transmit | TXCTL | CPUID field expanded to 8 bits to support new DCA standard. | Drivers supporting DCA should update to the new register layout.
Transmit | Tx queues registers | Moved to a different address space. Aliases for Q1-3 at 82575 address space. | None.
Transmit | Tx contexts | The 82576 allows 2 contexts per queue instead of the 16 global contexts supported in the 82575. | Change context indexing.
Transmit | TSO interleaving | TSO flows may now be interleaved at the L2 level. | Limitation on header buffers layout.
Legacy SerDes | TXCW, RXCW, SEC | Legacy SerDes mode removed - register removed. | Use new SerDes mode.
New SerDes | PCS_LCTL | Added possibility of HW based resolving of the flow control auto negotiation. Controlled by the PCS_LCTL.Force Flow Control bit. | Either use HW based AN resolving, or set this bit and use legacy SW based resolving.
Statistics | CRCERRS | Doesn't count alignment errors anymore. Counts Rx errors. This register is now equivalent to the FrameCheckError counter as defined by IEEE 802.3. | Change MIB & OID calculation.
Flow control | FCAL, FCAH | Registers are read only. | None.
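Three of the changes in Table A-1 reduce to simple arithmetic in a driver: the EITR granularity change (256 ns to 1 µs), the 5-bit RXDCTL/TXDCTL threshold fields, and the RETA queue index in bits 3:0. The following C sketch illustrates them under stated assumptions. All function and macro names are hypothetical and not taken from any Intel driver. The position of the interval field inside EITR, the actual descriptor cache size, and the width of a RETA entry are not given in this appendix, so the helpers return plain values and leave register packing to the caller.

```c
#include <assert.h>
#include <stdint.h>

/* EITR interval granularity per Table A-1 (hypothetical macro names). */
#define EITR_GRANULARITY_NS_82575 256u   /* 82575: 256 ns per unit */
#define EITR_GRANULARITY_NS_82576 1000u  /* 82576: 1 us per unit   */

/* Largest value a 5-bit threshold field can hold on the 82576. */
#define THRESH_FIELD_MAX_82576 0x1Fu

/* Convert an interrupt interval in nanoseconds to device units,
 * rounding down. Where the result goes inside EITR is not covered
 * by this appendix, so only the raw count is returned. */
static uint32_t eitr_units_from_ns(uint32_t interval_ns, uint32_t granularity_ns)
{
    return interval_ns / granularity_ns;
}

/* Limit a descriptor threshold so it exceeds neither the 5-bit field
 * nor the descriptor cache size. cache_size is supplied by the caller;
 * its real value is an assumption outside this appendix. */
static uint32_t clamp_threshold_82576(uint32_t threshold, uint32_t cache_size)
{
    uint32_t limit = cache_size < THRESH_FIELD_MAX_82576
                         ? cache_size
                         : THRESH_FIELD_MAX_82576;
    return threshold > limit ? limit : threshold;
}

/* Build a RETA entry value with the queue index in bits 3:0.
 * All other bits are left zero; the entry width is not stated here. */
static uint32_t reta_entry_82576(uint32_t queue)
{
    assert(queue < 16u);  /* the queue index must fit in 4 bits */
    return queue & 0x0Fu;
}
```

For example, a 125 µs interval is 125 units at the 82576 granularity, whereas the same interval is 488 units at 256 ns. A legacy driver that writes its old count unchanged would therefore get an interval roughly four times longer than intended.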

